Foreword


Beginning in the spring of 2000, a series of four one-semester courses
were taught at Princeton University whose purpose was to present, in 
an integrated manner, the core areas of analysis. The objective was to 
make plain the organic unity that exists between the various parts of the 
subject, and to illustrate the wide applicability of ideas of analysis to 
other fields of mathematics and science. The present series of books is 
an elaboration of the lectures that were given. 

While there are a number of excellent texts dealing with individual 
parts of what we cover, our exposition aims at a different goal: presenting
the various sub-areas of analysis not as separate disciplines, but
rather as highly interconnected. It is our view that seeing these relations
and their resulting synergies will motivate the reader to attain a better
understanding of the subject as a whole. With this outcome in mind, we
have concentrated on the main ideas and theorems that have shaped the 
field (sometimes sacrificing a more systematic approach), and we have 
been sensitive to the historical order in which the logic of the subject 
developed. 

We have organized our exposition into four volumes, each reflecting 
the material covered in a semester. Their contents may be broadly
summarized as follows:

I. Fourier series and integrals. 

II. Complex analysis. 

III. Measure theory, Lebesgue integration, and Hilbert spaces. 

IV. A selection of further topics, including functional analysis,
distributions, and elements of probability theory.

However, this listing does not by itself give a complete picture of 
the many interconnections that are presented, nor of the applications 
to other branches that are highlighted. To give a few examples: the elements
of (finite) Fourier series studied in Book I, which lead to Dirichlet
characters, and from there to the infinitude of primes in an arithmetic
progression; the X-ray and Radon transforms, which arise in a number of
problems in Book I, and reappear in Book III to play an important role in
understanding Besicovitch-like sets in two and three dimensions; Fatou’s
theorem, which guarantees the existence of boundary values of bounded
holomorphic functions in the disc, and whose proof relies on ideas developed
in each of the first three books; and the theta function, which first
occurs in Book I in the solution of the heat equation, and is then used 
in Book II to find the number of ways an integer can be represented as 
the sum of two or four squares, and in the analytic continuation of the 
zeta function. 

A few further words about the books and the courses on which they 
were based. These courses were given at a rather intensive pace, with 48
lecture-hours a semester. The weekly problem sets played an indispensable
part, and as a result exercises and problems have a similarly important
role in our books. Each chapter has a series of “Exercises” that
are tied directly to the text, and while some are easy, others may require
more effort. However, the substantial number of hints that are given
should enable the reader to attack most exercises. There are also more
involved and challenging “Problems”; the ones that are most difficult, or
go beyond the scope of the text, are marked with an asterisk.

Despite the substantial connections that exist between the different 
volumes, enough overlapping material has been provided so that each of 
the first three books requires only minimal prerequisites: acquaintance 
with elementary topics in analysis such as limits, series, differentiable 
functions, and Riemann integration, together with some exposure to linear
algebra. This makes these books accessible to students interested
in such diverse disciplines as mathematics, physics, engineering, and
finance, at both the undergraduate and graduate level.

It is with great pleasure that we express our appreciation to all who 
have aided in this enterprise. We are particularly grateful to the students
who participated in the four courses. Their continuing interest,
enthusiasm, and dedication provided the encouragement that made this 
project possible. We also wish to thank Adrian Banner and Jose Luis 
Rodrigo for their special help in running the courses, and their efforts to 
see that the students got the most from each class. In addition, Adrian 
Banner also made valuable suggestions that are incorporated in the text. 
Preface to Book IV


Functional analysis, as generally understood, brought with it a change 
of focus from the study of functions on everyday geometric spaces such 
as $\mathbb{R}$, $\mathbb{R}^d$, etc., to the analysis of abstract infinite-dimensional spaces, for
example, function spaces and Banach spaces. As such it established a
key framework for the development of modern analysis. 

Our first goal in this volume is to present the basic ideas of this theory, 
with particular emphasis on their connection to harmonic analysis. A 
second objective is to provide an introduction to some further topics to 
which any serious student of analysis ought to be exposed: probability 
theory, several complex variables and oscillatory integrals. Our choice of 
these subjects is guided, in the first instance, by their intrinsic interest. 
Moreover, these topics complement and extend ideas in the previous 
books in this series, and they serve our overarching goal of making plain 
the organic unity that exists between the various parts of analysis. 

Underlying this unity is the role of Fourier analysis in its interrelation 
with partial differential equations, complex analysis, and number theory. 
It is also exemplified by some of the specific questions that arose initially 
in the previous volumes and that are taken up again here: namely, the 
Dirichlet problem, ultimately treated by Brownian motion; the Radon 
transform, with its connection to Besicovitch sets; nowhere differentiable 
functions; and some problems in number theory, now formulated as distributions
of lattice points. We hope that this choice of material will not
only provide a broader view of analysis, but will also inspire the reader 
to pursue the further study of this subject. 




$L^p$ Spaces and Banach Spaces


In this work the assumption of quadratic integrability
will be replaced by the integrability of $|f(x)|^p$. The
analysis of these function classes will shed a particular
light on the real and apparent advantages of the
exponent 2; one can also expect that it will provide
essential material for an axiomatic study of function
spaces.

F. Riesz, 1910 


At present I propose above all to gather results about
linear operators defined in certain general spaces, notably
those that will here be called spaces of type (B)...

S. Banach, 1932 


Function spaces, in particular $L^p$ spaces, play a central role in many
questions in analysis. The special importance of $L^p$ spaces may be said
to derive from the fact that they offer a partial but useful generalization
of the fundamental $L^2$ space of square integrable functions.

In order of logical simplicity, the space $L^1$ comes first since it occurs
already in the description of functions integrable in the Lebesgue sense.
Connected to it via duality is the $L^\infty$ space of bounded functions, whose
supremum norm carries over from the more familiar space of continuous
functions. Of independent interest is the $L^2$ space, whose origins are
tied up with basic issues in Fourier analysis. The intermediate $L^p$ spaces
are in this sense an artifice, although of a most inspired and fortuitous
kind. That this is the case will be illustrated by results in the next and
succeeding chapters.

In this chapter we will concentrate on the basic structural facts about
the $L^p$ spaces. Here part of the theory, in particular the study of their
linear functionals, is best formulated in the more general context of Banach
spaces. An incidental benefit of this more abstract view-point is
that it leads us to the surprising discovery of a finitely additive measure
on all subsets, consistent with Lebesgue measure.
Chapter 1. $L^p$ SPACES AND BANACH SPACES


1 $L^p$ spaces

Throughout this chapter $(X, \mathcal{F}, \mu)$ denotes a $\sigma$-finite measure space: $X$
denotes the underlying space, $\mathcal{F}$ the $\sigma$-algebra of measurable sets, and $\mu$
the measure. If $1 \le p < \infty$, the space $L^p(X, \mathcal{F}, \mu)$ consists of all
complex-valued measurable functions on $X$ that satisfy

$$\int_X |f(x)|^p \, d\mu(x) < \infty. \tag{1}$$

To simplify the notation, we write $L^p(X, \mu)$, or $L^p(X)$, or simply $L^p$
when the underlying measure space has been specified. Then, if $f \in
L^p(X, \mathcal{F}, \mu)$ we define the $L^p$ norm of $f$ by

$$\|f\|_{L^p(X, \mathcal{F}, \mu)} = \left( \int_X |f(x)|^p \, d\mu(x) \right)^{1/p}.$$

We also abbreviate this to $\|f\|_{L^p(X)}$, $\|f\|_{L^p}$, or $\|f\|_p$.

When $p = 1$ the space $L^1(X, \mathcal{F}, \mu)$ consists of all integrable functions
on $X$, and we have shown in Chapter 6 of Book III that $L^1$ together with
$\|\cdot\|_{L^1}$ is a complete normed vector space. Also, the case $p = 2$ warrants
special attention: it is a Hilbert space.

We note here that we encounter the same technical point that we already
discussed in Book III. The problem is that $\|f\|_{L^p} = 0$ does not
imply that $f = 0$, but merely $f = 0$ almost everywhere (for the measure
$\mu$). Therefore, the precise definition of $L^p$ requires introducing the
equivalence relation in which $f$ and $g$ are equivalent if $f = g$ a.e. Then, $L^p$
consists of all equivalence classes of functions which satisfy (1). However,
in practice there is little risk of error by thinking of elements in $L^p$ as
functions rather than equivalence classes of functions.

The following are some common examples of $L^p$ spaces.

(a) The case $X = \mathbb{R}^d$ and $\mu$ equals Lebesgue measure is often used in
practice. There, we have

$$\|f\|_{L^p} = \left( \int_{\mathbb{R}^d} |f(x)|^p \, dx \right)^{1/p}.$$

(b) Also, one can take $X = \mathbb{Z}$, and $\mu$ equal to the counting measure.
Then, we get the “discrete” version of the $L^p$ spaces. Measurable
functions are simply sequences $f = \{a_n\}_{n \in \mathbb{Z}}$ of complex numbers,
and

$$\|f\|_{L^p} = \left( \sum_{n=-\infty}^{\infty} |a_n|^p \right)^{1/p}.$$

When $p = 2$, we recover the familiar sequence space $\ell^2(\mathbb{Z})$.

The spaces $L^p$ are examples of normed vector spaces. The basic property
satisfied by the norm is the triangle inequality, which we shall prove
shortly.

The range of $p$ which is of interest in most applications is $1 \le p < \infty$,
and later also $p = \infty$. There are at least two reasons why we restrict our
attention to these values of $p$: when $0 < p < 1$, the function $\|\cdot\|_{L^p}$ does
not satisfy the triangle inequality, and moreover, for such $p$, the space
$L^p$ has no non-trivial bounded linear functionals. (See Exercise 2.)
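
The failure of the triangle inequality for $0 < p < 1$ can already be seen on a two-point space with counting measure. The following Python sketch is only an illustration; the helper name `lp_norm` and the two vectors are assumptions chosen for the example, not notation from the text.

```python
def lp_norm(values, p):
    """Return (sum |a_n|^p)^(1/p) for a finite sequence (counting measure)."""
    return sum(abs(a) ** p for a in values) ** (1.0 / p)

# Two "functions" on a two-point space, with disjoint supports.
f = [1.0, 0.0]
g = [0.0, 1.0]
h = [a + b for a, b in zip(f, g)]  # f + g

p = 0.5
lhs = lp_norm(h, p)                  # (1 + 1)^(1/p) = 2^2 = 4
rhs = lp_norm(f, p) + lp_norm(g, p)  # 1 + 1 = 2
triangle_fails = lhs > rhs           # the triangle inequality is violated
```

With $p = 1/2$ the left-hand side is 4 while the right-hand side is 2, so $\|\cdot\|_{L^p}$ cannot be a norm in this range.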

When $p = 1$ the norm $\|\cdot\|_{L^1}$ satisfies the triangle inequality, and $L^1$
is a complete normed vector space. When $p = 2$, this result continues to
hold, although one needs the Cauchy-Schwarz inequality to prove it. In
the same way, for $1 \le p < \infty$ the proof of the triangle inequality relies on
a generalized version of the Cauchy-Schwarz inequality. This is Hölder’s
inequality, which is also the key in the duality of the $L^p$ spaces, as we
will see in Section 4.


1.1 The Hölder and Minkowski inequalities

If the two exponents $p$ and $q$ satisfy $1 \le p, q \le \infty$, and the relation

$$\frac{1}{p} + \frac{1}{q} = 1$$

holds, we say that $p$ and $q$ are conjugate or dual exponents. Here,
we use the convention $1/\infty = 0$. Later, we shall sometimes use $p'$ to
denote the conjugate exponent of $p$. Note that $p = 2$ is self-dual, that is,
$p = q = 2$; also $p = 1, \infty$ corresponds to $q = \infty, 1$ respectively.

Theorem 1.1 (Hölder) Suppose $1 < p < \infty$ and $1 < q < \infty$ are conjugate
exponents. If $f \in L^p$ and $g \in L^q$, then $fg \in L^1$ and

$$\|fg\|_{L^1} \le \|f\|_{L^p} \|g\|_{L^q}.$$

Note. Once we have defined $L^\infty$ (see Section 2) the corresponding
inequality for the exponents $1$ and $\infty$ will be seen to be essentially trivial.
The proof of the theorem relies on a simple generalized form of the
arithmetic-geometric mean inequality: if $A, B \ge 0$, and $0 \le \theta \le 1$, then

$$A^\theta B^{1-\theta} \le \theta A + (1 - \theta) B. \tag{2}$$

Note that when $\theta = 1/2$, the inequality (2) states the familiar fact that
the geometric mean of two numbers is majorized by their arithmetic
mean.

To establish (2), we observe first that we may assume $B \ne 0$, and
replacing $A$ by $AB$, we see that it suffices to prove that $A^\theta \le \theta A + (1 -
\theta)$. If we let $f(x) = x^\theta - \theta x - (1 - \theta)$, then $f'(x) = \theta(x^{\theta - 1} - 1)$. Thus
$f(x)$ increases when $0 \le x \le 1$ and decreases when $1 \le x$, and we see that
the continuous function $f$ attains a maximum at $x = 1$, where $f(1) = 0$.
Therefore $f(A) \le 0$, as desired.
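
As a sanity check of inequality (2), the sketch below evaluates the gap $\theta A + (1-\theta)B - A^\theta B^{1-\theta}$ on a small grid. It is illustrative only; the function name `weighted_am_gm_gap` and the grid values are assumptions made for this example.

```python
def weighted_am_gm_gap(A, B, theta):
    """Right side minus left side of (2); should be >= 0 for A, B >= 0."""
    return theta * A + (1.0 - theta) * B - (A ** theta) * (B ** (1.0 - theta))

grid = [0.0, 0.1, 0.5, 1.0, 2.0, 10.0]
thetas = [0.0, 0.25, 0.5, 0.75, 1.0]

# Smallest gap over the grid; non-negative up to rounding error.
min_gap = min(
    weighted_am_gm_gap(A, B, t) for A in grid for B in grid for t in thetas
)

# When A == B the two sides of (2) coincide.
gap_at_equal = weighted_am_gm_gap(3.0, 3.0, 0.3)
```

A finite grid proves nothing, of course; the point is only to make the roles of $A$, $B$, and $\theta$ in (2) concrete.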

To prove Hölder’s inequality we argue as follows. If either $\|f\|_{L^p} = 0$
or $\|g\|_{L^q} = 0$, then $fg = 0$ a.e. and the inequality is obviously verified.
Therefore, we may assume that neither of these norms vanishes, and after
replacing $f$ by $f/\|f\|_{L^p}$ and $g$ by $g/\|g\|_{L^q}$, we may further assume that
$\|f\|_{L^p} = \|g\|_{L^q} = 1$. We now need to prove that $\|fg\|_{L^1} \le 1$.
If we set $A = |f(x)|^p$, $B = |g(x)|^q$, and $\theta = 1/p$ so that $1 - \theta = 1/q$,
then (2) gives

$$|f(x)g(x)| \le \frac{1}{p} |f(x)|^p + \frac{1}{q} |g(x)|^q.$$

Integrating this inequality yields $\|fg\|_{L^1} \le 1$, and the proof of the Hölder
inequality is complete.

For the case when the equality $\|fg\|_{L^1} = \|f\|_{L^p} \|g\|_{L^q}$ holds, see
Exercise 3.
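
Hölder’s inequality can be checked numerically in the discrete setting of example (b), restricted to finitely many points. The sketch below is a minimal illustration under that assumption; the names `lp_norm` and `holder_gap`, the random seed, and the sampling ranges are choices made here and do not come from the text.

```python
import random

def lp_norm(values, p):
    """Return (sum |a_n|^p)^(1/p) for a finite sequence (counting measure)."""
    return sum(abs(a) ** p for a in values) ** (1.0 / p)

def holder_gap(f, g, p):
    """Return ||f||_p * ||g||_q - ||fg||_1, where 1/p + 1/q = 1 and 1 < p < inf."""
    q = p / (p - 1.0)  # conjugate exponent
    lhs = sum(abs(a * b) for a, b in zip(f, g))
    rhs = lp_norm(f, p) * lp_norm(g, q)
    return rhs - lhs

rng = random.Random(0)  # fixed seed so the run is reproducible
gaps = []
for _ in range(200):
    p = rng.uniform(1.1, 5.0)
    f = [rng.uniform(-1.0, 1.0) for _ in range(8)]
    g = [rng.uniform(-1.0, 1.0) for _ in range(8)]
    gaps.append(holder_gap(f, g, p))

min_gap = min(gaps)  # non-negative, up to rounding error
```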

We are now ready to prove the triangle inequality for the $L^p$ norm.

Theorem 1.2 (Minkowski) If $1 \le p < \infty$ and $f, g \in L^p$, then $f + g \in
L^p$ and $\|f + g\|_{L^p} \le \|f\|_{L^p} + \|g\|_{L^p}$.

Proof. The case $p = 1$ is obtained by integrating $|f(x) + g(x)| \le
|f(x)| + |g(x)|$. When $p > 1$, we may begin by verifying that $f + g \in L^p$,
when both $f$ and $g$ belong to $L^p$. Indeed,

$$|f(x) + g(x)|^p \le 2^p \left( |f(x)|^p + |g(x)|^p \right),$$

as can be seen by considering separately the cases $|f(x)| \le |g(x)|$ and
$|g(x)| \le |f(x)|$. Next we note that

$$|f(x) + g(x)|^p \le |f(x)| \, |f(x) + g(x)|^{p-1} + |g(x)| \, |f(x) + g(x)|^{p-1}.$$

If $q$ denotes the conjugate exponent of $p$, then $(p - 1)q = p$, so we see
that $(f + g)^{p-1}$ belongs to $L^q$, and therefore Hölder’s inequality applied
to the two terms on the right-hand side of the above inequality gives

$$\|f + g\|_{L^p}^p \le \|f\|_{L^p} \|(f + g)^{p-1}\|_{L^q} + \|g\|_{L^p} \|(f + g)^{p-1}\|_{L^q}. \tag{3}$$

However, using once again $(p - 1)q = p$, we get

$$\|(f + g)^{p-1}\|_{L^q} = \|f + g\|_{L^p}^{p/q}.$$

From (3), since $p - p/q = 1$, and because we may suppose that $\|f +
g\|_{L^p} > 0$, we find

$$\|f + g\|_{L^p} \le \|f\|_{L^p} + \|g\|_{L^p},$$

so the proof is finished.

1.2 Completeness of $L^p$

The triangle inequality makes $L^p$ into a metric space with distance
$d(f, g) = \|f - g\|_{L^p}$. The basic analytic fact is that $L^p$ is complete
in the sense that every Cauchy sequence in the norm $\|\cdot\|_{L^p}$ converges to
an element in $L^p$.

Taking limits is a necessity in many problems, and the $L^p$ spaces would
be of little use if they were not complete. Fortunately, like $L^1$ and $L^2$,
the general $L^p$ space does satisfy this desirable property.

Theorem 1.3 The space $L^p(X, \mathcal{F}, \mu)$ is complete in the norm $\|\cdot\|_{L^p}$.

Proof. The argument is essentially the same as for $L^1$ (or $L^2$); see
Section 2, Chapter 2 and Section 1, Chapter 4 in Book III. Let $\{f_n\}_{n=1}^\infty$
be a Cauchy sequence in $L^p$, and consider a subsequence $\{f_{n_k}\}_{k=1}^\infty$ of
$\{f_n\}$ with the following property: $\|f_{n_{k+1}} - f_{n_k}\|_{L^p} \le 2^{-k}$ for all $k \ge 1$.
We now consider the series, whose convergence will be seen below,

$$f(x) = f_{n_1}(x) + \sum_{k=1}^{\infty} \left( f_{n_{k+1}}(x) - f_{n_k}(x) \right)$$

and

$$g(x) = |f_{n_1}(x)| + \sum_{k=1}^{\infty} \left| f_{n_{k+1}}(x) - f_{n_k}(x) \right|,$$

and the corresponding partial sums

$$S_K(f)(x) = f_{n_1}(x) + \sum_{k=1}^{K} \left( f_{n_{k+1}}(x) - f_{n_k}(x) \right)$$

and

$$S_K(g)(x) = |f_{n_1}(x)| + \sum_{k=1}^{K} \left| f_{n_{k+1}}(x) - f_{n_k}(x) \right|.$$

The triangle inequality for $L^p$ implies

$$\|S_K(g)\|_{L^p} \le \|f_{n_1}\|_{L^p} + \sum_{k=1}^{K} \|f_{n_{k+1}} - f_{n_k}\|_{L^p} \le \|f_{n_1}\|_{L^p} + \sum_{k=1}^{K} 2^{-k}.$$

Letting $K$ tend to infinity, and applying the monotone convergence theorem
proves that $\int g^p < \infty$, and therefore the series defining $g$, and hence
the series defining $f$, converges almost everywhere, and $f \in L^p$.

We now show that $f$ is the desired limit of the sequence $\{f_n\}$. Since
(by construction of the telescopic series) the $(K - 1)$th partial sum of
this series is precisely $f_{n_K}$, we find that

$$f_{n_K}(x) \to f(x) \quad \text{a.e. } x.$$

To prove that $f_{n_K} \to f$ in $L^p$ as well, we first observe that

$$|f(x) - S_K(f)(x)|^p \le \left[ 2 \max\left( |f(x)|, |S_K(f)(x)| \right) \right]^p
\le 2^p |f(x)|^p + 2^p |S_K(f)(x)|^p
\le 2^{p+1} |g(x)|^p,$$

for all $K$. Then, we may apply the dominated convergence theorem to
get $\|f_{n_K} - f\|_{L^p} \to 0$ as $K$ tends to infinity.

Finally, the last step of the proof consists of recalling that $\{f_n\}$ is
Cauchy. Given $\epsilon > 0$, there exists $N$ so that for all $n, m > N$ we have
$\|f_n - f_m\|_{L^p} < \epsilon/2$. If $n_K$ is chosen so that $n_K > N$, and $\|f_{n_K} - f\|_{L^p} <
\epsilon/2$, then the triangle inequality implies

$$\|f_n - f\|_{L^p} \le \|f_n - f_{n_K}\|_{L^p} + \|f_{n_K} - f\|_{L^p} < \epsilon$$

whenever $n > N$. This concludes the proof of the theorem.


1.3 Further remarks 

We begin by looking at some possible inclusion relations between the
various $L^p$ spaces. The matter is simple if the underlying space has
finite measure.

Proposition 1.4 If $X$ has finite positive measure, and $p_0 \le p_1$, then
$L^{p_1}(X) \subset L^{p_0}(X)$ and

$$\frac{1}{\mu(X)^{1/p_0}} \|f\|_{L^{p_0}} \le \frac{1}{\mu(X)^{1/p_1}} \|f\|_{L^{p_1}}.$$

We may assume that $p_1 > p_0$. Suppose $f \in L^{p_1}$, and set $F = |f|^{p_0}$,
$G = 1$, $p = p_1/p_0 > 1$, and $1/p + 1/q = 1$, in Hölder’s inequality applied
to $F$ and $G$. This yields

$$\|f\|_{L^{p_0}}^{p_0} \le \left( \int |f|^{p_1} \right)^{p_0/p_1} \cdot \mu(X)^{1 - p_0/p_1}.$$

In particular, we find that $\|f\|_{L^{p_0}} < \infty$. Moreover, by taking the $p_0$th root
of both sides of the above inequality, we find that the inequality in the
proposition holds.

However, as is easily seen, such inclusion does not hold when $X$ has
infinite measure. (See Exercise 1.) Yet, in an interesting special case the
opposite inclusion does hold.

Proposition 1.5 If $X = \mathbb{Z}$ is equipped with counting measure, then the
reverse inclusion holds, namely $L^{p_0}(\mathbb{Z}) \subset L^{p_1}(\mathbb{Z})$ if $p_0 \le p_1$. Moreover,

$$\|f\|_{L^{p_1}} \le \|f\|_{L^{p_0}}.$$

Indeed, if $f = \{f(n)\}_{n \in \mathbb{Z}}$, then $\sum |f(n)|^{p_0} = \|f\|_{L^{p_0}}^{p_0}$, and $\sup_n |f(n)| \le
\|f\|_{L^{p_0}}$. However

$$\sum |f(n)|^{p_1} = \sum |f(n)|^{p_0} |f(n)|^{p_1 - p_0}
\le \left( \sup_n |f(n)| \right)^{p_1 - p_0} \|f\|_{L^{p_0}}^{p_0}
\le \|f\|_{L^{p_0}}^{p_1}.$$

Thus $\|f\|_{L^{p_1}} \le \|f\|_{L^{p_0}}$.
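
Counting measure on a finite set is both a finite measure and a discrete one, so Propositions 1.4 and 1.5 can be observed side by side: the plain $\ell^p$ norm decreases as $p$ grows, while the norm divided by $\mu(X)^{1/p}$ increases. The following sketch is only illustrative; the helper names and the sample vector are assumptions made for the example.

```python
def lp_norm(values, p):
    """Return (sum |a_n|^p)^(1/p) for a finite sequence (counting measure)."""
    return sum(abs(a) ** p for a in values) ** (1.0 / p)

def normalized_lp_norm(values, p):
    """Return ||f||_p / mu(X)^(1/p), where mu(X) is the number of points."""
    return lp_norm(values, p) / len(values) ** (1.0 / p)

f = [3.0, -1.0, 0.5, 2.0]
exponents = [1.0, 1.5, 2.0, 3.0, 6.0]

plain = [lp_norm(f, p) for p in exponents]                  # Proposition 1.5
normalized = [normalized_lp_norm(f, p) for p in exponents]  # Proposition 1.4

plain_is_decreasing = all(a >= b for a, b in zip(plain, plain[1:]))
normalized_is_increasing = all(a <= b for a, b in zip(normalized, normalized[1:]))
```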


2 The case $p = \infty$

Finally, we also consider the limiting case $p = \infty$. The space $L^\infty$ will
be defined as all functions that are “essentially bounded” in the following
sense. We take the space $L^\infty(X, \mathcal{F}, \mu)$ to consist of all (equivalence
classes of) measurable functions on $X$, so that there exists a positive
number $0 < M < \infty$, with

$$|f(x)| \le M \quad \text{a.e. } x.$$

Then, we define $\|f\|_{L^\infty(X, \mathcal{F}, \mu)}$ to be the infimum of all possible values $M$
satisfying the above inequality. The quantity $\|f\|_{L^\infty}$ is sometimes called
the essential-supremum of $f$.

We note that with this definition, we have $|f(x)| \le \|f\|_{L^\infty}$ for a.e. $x$.
Indeed, if $E = \{x : |f(x)| > \|f\|_{L^\infty}\}$, and $E_n = \{x : |f(x)| > \|f\|_{L^\infty} +
1/n\}$, then we have $\mu(E_n) = 0$, and $E = \bigcup E_n$, hence $\mu(E) = 0$.

Theorem 2.1 The vector space $L^\infty$ equipped with $\|\cdot\|_{L^\infty}$ is a complete
vector space.

This assertion is easy to verify and is left to the reader. Moreover,
Hölder’s inequality continues to hold for values of $p$ and $q$ in the larger
range $1 \le p, q \le \infty$, once we take $p = 1$ and $q = \infty$ as conjugate
exponents, as we mentioned before.

The fact that $L^\infty$ is a limiting case of $L^p$ when $p$ tends to $\infty$ can be
understood as follows.


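Proposition 2.2 Suppose $f \in L^\infty$ is supported on a set of finite measure.
Then $f \in L^p$ for all $p < \infty$, and

$$\|f\|_{L^p} \to \|f\|_{L^\infty} \quad \text{as } p \to \infty.$$

Proof. Let $E$ be a measurable subset of $X$ with $\mu(E) < \infty$, and so
that $f$ vanishes in the complement of $E$. If $\mu(E) = 0$, then $\|f\|_{L^\infty} =
\|f\|_{L^p} = 0$ and there is nothing to prove. Otherwise

$$\|f\|_{L^p} = \left( \int_E |f(x)|^p \, d\mu \right)^{1/p}
\le \left( \int_E \|f\|_{L^\infty}^p \, d\mu \right)^{1/p}
\le \|f\|_{L^\infty} \, \mu(E)^{1/p}.$$

Since $\mu(E)^{1/p} \to 1$ as $p \to \infty$, we find that $\limsup_{p \to \infty} \|f\|_{L^p} \le \|f\|_{L^\infty}$.

On the other hand, given $\epsilon > 0$, we have

$$\mu(\{x : |f(x)| \ge \|f\|_{L^\infty} - \epsilon\}) \ge \delta \quad \text{for some } \delta > 0,$$

hence

$$\int_X |f|^p \, d\mu \ge \delta \left( \|f\|_{L^\infty} - \epsilon \right)^p.$$

Therefore $\liminf_{p \to \infty} \|f\|_{L^p} \ge \|f\|_{L^\infty} - \epsilon$, and since $\epsilon$ is arbitrary, we
have $\liminf_{p \to \infty} \|f\|_{L^p} \ge \|f\|_{L^\infty}$. Hence the limit $\lim_{p \to \infty} \|f\|_{L^p}$ exists,
and equals $\|f\|_{L^\infty}$.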
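
The convergence in Proposition 2.2 can be watched numerically on a finite set with counting measure, where the essential supremum is simply the largest absolute value. The sketch below is an illustration under that assumption; the sample values are kept below 1 in absolute value so that large powers stay within floating-point range, and the names are choices made for this example.

```python
def lp_norm(values, p):
    """Return (sum |a_n|^p)^(1/p) for a finite sequence (counting measure)."""
    return sum(abs(a) ** p for a in values) ** (1.0 / p)

f = [0.2, -0.9, 0.5, 0.9]
sup_norm = max(abs(a) for a in f)  # the L-infinity norm, here 0.9

exponents = [1.0, 2.0, 10.0, 50.0, 200.0]
norms = [lp_norm(f, p) for p in exponents]

# Each L^p norm dominates the sup norm (Proposition 1.5 with p_1 = infinity),
# and the values approach the sup norm as p grows.
all_above_sup = all(n >= sup_norm - 1e-12 for n in norms)
final_error = norms[-1] - sup_norm
```

Here the largest absolute value is attained twice, so the gap behaves roughly like $0.9\,(2^{1/p} - 1)$ for large $p$, which tends to 0 as the proposition predicts.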
3 Banach spaces 

We introduce here a general notion which encompasses the $L^p$ spaces as
specific examples.

First, a normed vector space consists of an underlying vector space $V$
over a field of scalars (the real or complex numbers), together with a
norm $\|\cdot\| : V \to \mathbb{R}^+$ that satisfies:

• $\|v\| = 0$ if and only if $v = 0$.

• $\|\alpha v\| = |\alpha| \, \|v\|$, whenever $\alpha$ is a scalar and $v \in V$.

• $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in V$.

The space $V$ is said to be complete if whenever $\{v_n\}$ is a Cauchy
sequence in $V$, that is, $\|v_n - v_m\| \to 0$ as $n, m \to \infty$, then there exists a
$v \in V$ such that $\|v_n - v\| \to 0$ as $n \to \infty$.

A complete normed vector space is called a Banach space. Here
again, we stress the importance of the fact that Cauchy sequences converge
to a limit in the space itself, hence the space is “closed” under
limiting operations.

3.1 Examples 

The real numbers $\mathbb{R}$ with the usual absolute value form an initial example
of a Banach space. Other easy examples are $\mathbb{R}^d$ with the Euclidean norm,
and more generally a Hilbert space with its norm given in terms of its
inner product.

Several further relevant examples are as follows:

Example 1. The family of $L^p$ spaces with $1 \le p \le \infty$ which we have just
introduced are also important examples of Banach spaces (Theorem 1.3
and Theorem 2.1). Incidentally, $L^2$ is the only Hilbert space in the
family $L^p$, where $1 \le p \le \infty$ (Exercise 25), and this in part accounts for
the special flavor of the analysis carried out in $L^2$ as opposed to $L^1$ or
more generally $L^p$ for $p \ne 2$.

Finally, observe that since the triangle inequality fails in general when
$0 < p < 1$, $\|\cdot\|_{L^p}$ is not a norm on $L^p$ for this range of $p$, hence it is not
a Banach space.

Example 2. Another example of a Banach space is $C([0, 1])$, or more
generally $C(X)$ with $X$ a compact set in a metric space, as will be defined
in Section 7. By definition, $C(X)$ is the vector space of continuous
functions on $X$ equipped with the sup-norm $\|f\| = \sup_{x \in X} |f(x)|$.
Completeness is guaranteed by the fact that the uniform limit of a sequence
of continuous functions is also continuous.


Example 3. Two further examples are important in various applications.
The first is the space $\Lambda^\alpha(\mathbb{R})$ of all bounded functions on $\mathbb{R}$ which satisfy
a Hölder (or Lipschitz) condition of exponent $\alpha$ with $0 < \alpha \le 1$,
that is,

$$\sup_{t_1 \ne t_2} \frac{|f(t_1) - f(t_2)|}{|t_1 - t_2|^\alpha} < \infty.$$

Observe that $f$ is then necessarily continuous; also the only interesting
case is when $\alpha \le 1$, since a function which satisfies a Hölder condition of
exponent $\alpha$ with $\alpha > 1$ is constant.$^2$

More generally, this space can be defined on $\mathbb{R}^d$; it consists of continuous
functions $f$ equipped with the norm

$$\|f\|_{\Lambda^\alpha(\mathbb{R}^d)} = \sup_{x \in \mathbb{R}^d} |f(x)| + \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|^\alpha}.$$

With this norm, $\Lambda^\alpha(\mathbb{R}^d)$ is a Banach space (see also Exercise 29).


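Example 4. A function $f \in L^p(\mathbb{R}^d)$ is said to have weak derivatives
in $L^p$ up to order $k$, if for every multi-index $\alpha = (\alpha_1, \ldots, \alpha_d)$ with $|\alpha| =
\alpha_1 + \cdots + \alpha_d \le k$, there is a $g_\alpha \in L^p$ with

$$\int_{\mathbb{R}^d} g_\alpha(x) \varphi(x) \, dx = (-1)^{|\alpha|} \int_{\mathbb{R}^d} f(x) \, \partial_x^\alpha \varphi(x) \, dx \tag{4}$$

for all smooth functions $\varphi$ that have compact support in $\mathbb{R}^d$. Here, we
use the multi-index notation

$$\partial_x^\alpha = \left( \frac{\partial}{\partial x} \right)^\alpha = \left( \frac{\partial}{\partial x_1} \right)^{\alpha_1} \cdots \left( \frac{\partial}{\partial x_d} \right)^{\alpha_d}.$$

Clearly, the functions $g_\alpha$ (when they exist) are unique, and we also write
$\partial_x^\alpha f = g_\alpha$. This definition arises from the relationship (4) which holds
whenever $f$ is itself smooth, and $g_\alpha$ equals the usual derivative $\partial_x^\alpha f$, as
follows from an integration by parts (see also Section 3.1, Chapter 5 in
Book III).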


2 We have already encountered this space in Book I, Chapter 2 and Book III, Chapter 7. 
The space $L^p_k(\mathbb{R}^d)$ is the subspace of $L^p(\mathbb{R}^d)$ of all functions that have
weak derivatives up to order $k$. (The concept of weak derivatives will
reappear in Chapter 3 in the setting of derivatives in the sense of
distributions.) This space is usually referred to as a Sobolev space. A norm
that turns $L^p_k(\mathbb{R}^d)$ into a Banach space is

$$\|f\|_{L^p_k(\mathbb{R}^d)} = \sum_{|\alpha| \le k} \|\partial_x^\alpha f\|_{L^p(\mathbb{R}^d)}.$$

Example 5. In the case $p = 2$, we note in the above example that an
$L^2$ function $f$ belongs to $L^2_k(\mathbb{R}^d)$ if and only if $(1 + |\xi|^2)^{k/2} \hat{f}(\xi)$ belongs
to $L^2$, and that $\|(1 + |\xi|^2)^{k/2} \hat{f}(\xi)\|_{L^2}$ is a Hilbert space norm equivalent
to $\|f\|_{L^2_k(\mathbb{R}^d)}$.

Therefore, if $k$ is any positive number, it is natural to define $L^2_k$ as
those functions $f$ in $L^2$ for which $(1 + |\xi|^2)^{k/2} \hat{f}(\xi)$ belongs to $L^2$, and we
can equip $L^2_k$ with the norm $\|f\|_{L^2_k(\mathbb{R}^d)} = \|(1 + |\xi|^2)^{k/2} \hat{f}(\xi)\|_{L^2}$.

3.2 Linear functionals and the dual of a Banach space 

For the sake of simplicity, we restrict ourselves in this and the following
two sections to Banach spaces over $\mathbb{R}$; the reader will find in Section 6
the slight modifications necessary to extend the results to Banach spaces
over $\mathbb{C}$.

Suppose that $\mathcal{B}$ is a Banach space over $\mathbb{R}$ equipped with a norm $\|\cdot\|$. A
linear functional is a linear mapping $\ell$ from $\mathcal{B}$ to $\mathbb{R}$, that is, $\ell : \mathcal{B} \to \mathbb{R}$,
which satisfies

$$\ell(\alpha f + \beta g) = \alpha \ell(f) + \beta \ell(g), \quad \text{for all } \alpha, \beta \in \mathbb{R}, \text{ and } f, g \in \mathcal{B}.$$

A linear functional $\ell$ is continuous if given $\epsilon > 0$ there exists $\delta > 0$ so
that $|\ell(f) - \ell(g)| \le \epsilon$ whenever $\|f - g\| \le \delta$. Also we say that a linear
functional is bounded if there is $M > 0$ with $|\ell(f)| \le M \|f\|$ for all $f \in
\mathcal{B}$. The linearity of $\ell$ shows that these two notions are in fact equivalent.

Proposition 3.1 A linear functional on a Banach space is continuous,
if and only if it is bounded.

Proof. The key is to observe that $\ell$ is continuous if and only if $\ell$ is
continuous at the origin.

Indeed, if $\ell$ is continuous, we choose $\epsilon = 1$ and $g = 0$ in the above
definition so that $|\ell(f)| \le 1$ whenever $\|f\| \le \delta$, for some $\delta > 0$. Hence,
given any non-zero $h$, an element of $\mathcal{B}$, we see that $\delta h/\|h\|$ has norm equal
to $\delta$, and hence $|\ell(\delta h/\|h\|)| \le 1$. Thus $|\ell(h)| \le M \|h\|$ with $M = 1/\delta$.

Conversely, if $\ell$ is bounded it is clearly continuous at the origin, hence
continuous.

The significance of continuous linear functionals in terms of closed
hyperplanes in $\mathcal{B}$ is a noteworthy geometric point to which we return
later on. Now we take up analytic aspects of linear functionals.

The set of all continuous linear functionals over $\mathcal{B}$ is a vector space
since we may add linear functionals and multiply them by scalars:

$$(\ell_1 + \ell_2)(f) = \ell_1(f) + \ell_2(f) \quad \text{and} \quad (\alpha \ell)(f) = \alpha \ell(f).$$

This vector space may be equipped with a norm as follows. The norm
$\|\ell\|$ of a continuous linear functional $\ell$ is the infimum of all values $M$ for
which $|\ell(f)| \le M \|f\|$ for all $f \in \mathcal{B}$. From this definition and the linearity
of $\ell$ it is clear that

$$\|\ell\| = \sup_{\|f\| \le 1} |\ell(f)| = \sup_{\|f\| = 1} |\ell(f)| = \sup_{f \ne 0} \frac{|\ell(f)|}{\|f\|}.$$

The vector space of all continuous linear functionals on $\mathcal{B}$ equipped
with $\|\cdot\|$ is called the dual space of $\mathcal{B}$, and is denoted by $\mathcal{B}^*$.

Theorem 3.2 The vector space $\mathcal{B}^*$ is a Banach space.

Proof. It is clear that $\|\cdot\|$ defines a norm, so we only check that $\mathcal{B}^*$ is
complete. Suppose that $\{\ell_n\}$ is a Cauchy sequence in $\mathcal{B}^*$. Then, for each
$f \in \mathcal{B}$, the sequence $\{\ell_n(f)\}$ is Cauchy, hence converges to a limit, which
we denote by $\ell(f)$. Clearly, the mapping $\ell : f \mapsto \ell(f)$ is linear. If $M$ is
so that $\|\ell_n\| \le M$ for all $n$, we see that

$$|\ell(f)| \le |(\ell - \ell_n)(f)| + |\ell_n(f)| \le |(\ell - \ell_n)(f)| + M \|f\|,$$

so that in the limit as $n \to \infty$, we find $|\ell(f)| \le M \|f\|$ for all $f \in \mathcal{B}$.
Thus $\ell$ is bounded. Finally, we must show that $\ell_n$ converges to $\ell$ in $\mathcal{B}^*$.
Given $\epsilon > 0$ choose $N$ so that $\|\ell_n - \ell_m\| < \epsilon/2$ for all $n, m > N$. Then,
if $n > N$, we see that for all $m > N$ and any $f$

$$|(\ell - \ell_n)(f)| \le |(\ell - \ell_m)(f)| + |(\ell_m - \ell_n)(f)| \le |(\ell - \ell_m)(f)| + \frac{\epsilon}{2} \|f\|.$$

We can also choose $m$ so large (and dependent on $f$) so that we also have
$|(\ell - \ell_m)(f)| \le \epsilon \|f\|/2$. In the end, we find that for $n > N$,

$$|(\ell - \ell_n)(f)| \le \epsilon \|f\|.$$

This proves that $\|\ell - \ell_n\| \to 0$, as desired.

In general, given a Banach space $\mathcal{B}$, it is interesting and very useful to
be able to describe its dual $\mathcal{B}^*$. This problem has an essentially complete
answer in the case of the $L^p$ spaces introduced before.


4 The dual space of $L^p$ when $1 \le p < \infty$

Suppose that $1 \le p \le \infty$ and $q$ is the conjugate exponent of $p$, that is,
$1/p + 1/q = 1$. The key observation to make is the following: Hölder’s
inequality shows that every function $g \in L^q$ gives rise to a bounded linear
functional on $L^p$ by

$$\ell(f) = \int_X f(x) g(x) \, d\mu(x), \tag{5}$$

and that $\|\ell\| \le \|g\|_{L^q}$. Therefore, if we associate $g$ to $\ell$ above, then we
find that $L^q \subset (L^p)^*$ when $1 \le p \le \infty$. The main result in this section
is to prove that when $1 \le p < \infty$, every bounded linear functional on $L^p$ is of
the form (5) for some $g \in L^q$. This implies that $(L^p)^* = L^q$ whenever
$1 \le p < \infty$. We remark that this result is in general not true when $p = \infty$;
the dual of $L^\infty$ contains $L^1$, but it is larger. (See the end of Section 5.3
below.)

Theorem 4.1 Suppose $1 \le p < \infty$, and $1/p + 1/q = 1$. Then, with $\mathcal{B} =
L^p$ we have

$$\mathcal{B}^* = L^q,$$

in the following sense: For every bounded linear functional $\ell$ on $L^p$ there
is a unique $g \in L^q$ so that

$$\ell(f) = \int_X f(x) g(x) \, d\mu(x), \quad \text{for all } f \in L^p.$$

Moreover, $\|\ell\|_{\mathcal{B}^*} = \|g\|_{L^q}$.

This theorem justifies the terminology whereby $q$ is usually called the
dual exponent of $p$.

The proof of the theorem is based on two ideas. The first, as already
seen, is Hölder’s inequality, to which a converse is also needed. The
second is the fact that a linear functional $\ell$ on $L^p$, $1 \le p < \infty$, leads
naturally to a (signed) measure $\nu$. Because of the continuity of $\ell$ the measure
$\nu$ is absolutely continuous with respect to the underlying measure $\mu$, and
our desired function $g$ is then the density function of $\nu$ in terms of $\mu$.
We begin with: 
Lemma 4.2 Suppose $1 \le p, q \le \infty$ are conjugate exponents.

(i) If $g \in L^q$, then

$$\|g\|_{L^q} = \sup_{\|f\|_{L^p} \le 1} \left| \int fg \right|.$$

(ii) Suppose $g$ is integrable on all sets of finite measure, and

$$\sup_{\substack{\|f\|_{L^p} \le 1 \\ f \text{ simple}}} \left| \int fg \right| = M < \infty.$$

Then $g \in L^q$, and $\|g\|_{L^q} = M$.

For the proof of the lemma, we recall the signum of a real number
defined by

$$\operatorname{sign}(x) = \begin{cases} 1 & \text{if } x > 0, \\ -1 & \text{if } x < 0, \\ 0 & \text{if } x = 0. \end{cases}$$

Proof. We start with (i). If $g = 0$, there is nothing to prove, so
we may assume that $g$ is not 0 a.e., and hence $\|g\|_{L^q} \ne 0$. By Hölder’s
inequality, we have that

$$\|g\|_{L^q} \ge \sup_{\|f\|_{L^p} \le 1} \left| \int fg \right|.$$

To prove the reverse inequality we consider several cases.

• First, if $q = 1$ and $p = \infty$, we may take $f(x) = \operatorname{sign} g(x)$. Then, we
have $\|f\|_{L^\infty} = 1$, and clearly, $\int fg = \|g\|_{L^1}$.

• If $1 < p, q < \infty$, then we set $f(x) = |g(x)|^{q-1} \operatorname{sign} g(x) / \|g\|_{L^q}^{q-1}$. We
observe that $\|f\|_{L^p}^p = \int |g(x)|^{p(q-1)} \, d\mu / \|g\|_{L^q}^{p(q-1)} = 1$ since $p(q -
1) = q$, and that $\int fg = \|g\|_{L^q}$.

• Finally, if $q = \infty$ and $p = 1$, let $\epsilon > 0$, and $E$ a set of finite positive
measure, where $|g(x)| \ge \|g\|_{L^\infty} - \epsilon$. (Such a set exists by the
definition of $\|g\|_{L^\infty}$ and the fact that the measure $\mu$ is $\sigma$-finite.)
Then, if we take $f(x) = \chi_E(x) \operatorname{sign} g(x) / \mu(E)$, where $\chi_E$ denotes
the characteristic function of the set $E$, we see that $\|f\|_{L^1} = 1$, and
also

$$\left| \int fg \right| = \frac{1}{\mu(E)} \int_E |g| \ge \|g\|_{L^\infty} - \epsilon.$$

This completes the proof of part (i).
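
The extremal function used in the case $1 < p, q < \infty$ of part (i) can be tested on a finite set with counting measure. The sketch below is illustrative only; the helper names `lp_norm`, `sign`, and `extremal`, together with the sample vector and the choice $p = 3$, are assumptions made for this example.

```python
def lp_norm(values, p):
    """Return (sum |a_n|^p)^(1/p) for a finite sequence (counting measure)."""
    return sum(abs(a) ** p for a in values) ** (1.0 / p)

def sign(x):
    """Signum of a real number: 1, -1, or 0."""
    return (x > 0) - (x < 0)

def extremal(g, q):
    """Return f = |g|^(q-1) sign(g) / ||g||_q^(q-1), as in the proof of (i)."""
    scale = lp_norm(g, q) ** (q - 1.0)
    return [abs(b) ** (q - 1.0) * sign(b) / scale for b in g]

p = 3.0
q = p / (p - 1.0)  # conjugate exponent, here 1.5
g = [2.0, -1.0, 0.0, 0.5]

f = extremal(g, q)
norm_f = lp_norm(f, p)                      # expected to equal 1
pairing = sum(a * b for a, b in zip(f, g))  # the "integral" of f*g
norm_g = lp_norm(g, q)                      # expected to equal the pairing
```

Up to rounding, `norm_f` is 1 and `pairing` agrees with `norm_g`, which is the equality that turns Hölder’s upper bound into the identity claimed in (i).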

To prove (ii) we recall$^3$ that we can find a sequence $\{g_n\}$ of simple
functions so that $|g_n(x)| \le |g(x)|$ while $g_n(x) \to g(x)$ for each $x$. When
$p > 1$ (so $q < \infty$), we take $f_n(x) = |g_n(x)|^{q-1} \operatorname{sign} g(x) / \|g_n\|_{L^q}^{q-1}$. As
before, $\|f_n\|_{L^p} = 1$. However

$$\|g_n\|_{L^q} = \frac{\int |g_n(x)|^q}{\|g_n\|_{L^q}^{q-1}} \le \frac{\int |g_n(x)|^{q-1} |g(x)|}{\|g_n\|_{L^q}^{q-1}} = \int f_n g,$$

and this does not exceed $M$. By Fatou’s lemma it follows that $\int |g|^q \le
M^q$, so $g \in L^q$ with $\|g\|_{L^q} \le M$. The direction $\|g\|_{L^q} \ge M$ is of course
implied by Hölder’s inequality.

When $p = 1$ the argument is parallel with the above but simpler. Here
we take $f_n(x) = (\operatorname{sign} g(x)) \chi_{E_n}(x)$, where $E_n$ is an increasing sequence
of sets of finite measure whose union is $X$. The details may be left to
the reader.


With the lemma established we turn to the proof of the theorem. It
is simpler to consider first the case when the underlying space has finite
measure. In this case, with $\ell$ the given functional on $L^p$, we can then
define a set function $\nu$ by
$$\nu(E) = \ell(\chi_E),$$
where $E$ is any measurable set. This definition makes sense because $\chi_E$ is
now automatically in $L^p$ since the space has finite measure. We observe
that
$$|\nu(E)| \le c\,(\mu(E))^{1/p}, \tag{6}$$
where $c$ is the norm of the linear functional, taking into account the fact
that $\|\chi_E\|_{L^p} = (\mu(E))^{1/p}$.

Now the linearity of $\ell$ clearly implies that $\nu$ is finitely-additive. Moreover,
if $\{E_n\}$ is a countable collection of disjoint measurable sets, and we
put $E = \bigcup_{n=1}^{\infty} E_n$, $E_N^* = \bigcup_{n=N+1}^{\infty} E_n$, then obviously
$$\chi_E = \chi_{E_N^*} + \sum_{n=1}^{N} \chi_{E_n}.$$
Thus $\nu(E) = \nu(E_N^*) + \sum_{n=1}^{N} \nu(E_n)$. However $\nu(E_N^*) \to 0$, as $N \to \infty$,
because of (6) and the assumption $p < \infty$. This shows that $\nu$ is countably
additive and, moreover, (6) also shows us that $\nu$ is absolutely continuous
with respect to $\mu$.

3 See for instance Section 2 in Chapter 6 of Book III.

We can now invoke the key result about absolutely continuous measures,
the Lebesgue-Radon-Nikodym theorem. (See for example Theorem 4.3,
Chapter 6 in Book III.) It guarantees the existence of an integrable
function $g$ so that $\nu(E) = \int_E g\,d\mu$ for every measurable set $E$.
Thus we have $\ell(\chi_E) = \int \chi_E\, g\,d\mu$. The representation $\ell(f) = \int fg\,d\mu$ then
extends immediately to simple functions $f$, and by a passage to the limit,
to all $f \in L^p$ since the simple functions are dense in $L^p$, $1 \le p < \infty$. (See
Exercise 6.) Also by Lemma 4.2, we see that $\|g\|_{L^q} = \|\ell\|$.

To pass from the situation where the measure of $X$ is finite to the
general case, we use an increasing sequence $\{E_n\}$ of sets of finite measure
that exhaust $X$, that is, $X = \bigcup_{n=1}^{\infty} E_n$. According to what we have just
proved, for each $n$ there is an integrable function $g_n$ on $E_n$ (which we
can set to be zero in $E_n^c$) so that
$$\ell(f) = \int f g_n\,d\mu \tag{7}$$
whenever $f$ is supported in $E_n$ and $f \in L^p$. Moreover by conclusion (ii)
of the lemma $\|g_n\|_{L^q} \le \|\ell\|$.

Now it is easy to see because of (7) that $g_n = g_m$ a.e. on $E_m$, whenever
$n \ge m$. Thus $\lim_{n \to \infty} g_n(x) = g(x)$ exists for almost every $x$, and by
Fatou's lemma, $\|g\|_{L^q} \le \|\ell\|$. As a result we have that $\ell(f) = \int fg\,d\mu$ for
each $f \in L^p$ supported in $E_n$, and then by a simple limiting argument, for
all $f \in L^p$. The fact that $\|\ell\| \le \|g\|_{L^q}$ is already contained in Hölder's
inequality, and therefore the proof of the theorem is complete.


5 More about linear functionals 

First we turn to the study of certain geometric aspects of linear functionals
in terms of the hyperplanes that they define. This will also involve
understanding some elementary ideas about convexity.

5.1 Separation of convex sets 

Although our ultimate focus will be on Banach spaces, we begin by considering
an arbitrary vector space $V$ over the reals. In this general setting
we can define the following notions.

First, a proper hyperplane is a linear subspace of $V$ that arises as
the zero set of a (non-zero) linear functional on $V$. Alternatively, it is
a linear subspace of $V$ so that it, together with any vector not in it,
spans $V$. Related to this notion is that of an affine hyperplane (which
for brevity we will always refer to as a hyperplane) defined to be a
translate of a proper hyperplane by a vector in $V$. To put it another
way: $H$ is a hyperplane if there is a non-zero linear functional $\ell$, and a
real number $a$, so that
$$H = \{v \in V : \ell(v) = a\}.$$

Another relevant notion is that of a convex set. The subset $K \subset V$ is said
to be convex if whenever $v_0$ and $v_1$ are both in $K$ then the straight-line
segment joining them
$$v(t) = (1 - t)v_0 + t v_1, \quad 0 \le t \le 1 \tag{8}$$
also lies entirely in $K$.

A key heuristic idea underlying our considerations can be enunciated 
as the following general principle: 

     If $K$ is a convex set and $v_0 \notin K$, then $K$ and $v_0$ can be
     separated by a hyperplane.

This principle is illustrated in Figure 1.

Figure 1. Separation of a convex set and a point by a hyperplane (the hyperplane is labeled $\ell(v) = a$ in the figure)

The sense in which this is meant is that there is a non-zero linear
functional $\ell$ and a real number $a$, so that
$$\ell(v_0) \ge a, \quad \text{while } \ell(v) < a \text{ if } v \in K.$$

To give an idea of what is behind this principle we show why it holds in
a nice special case. (See also Section 5.2.)
Proposition 5.1 The assertion above is valid if $V = \mathbb{R}^d$ and $K$ is convex
and open.

Proof. Since we may assume that $K$ is non-empty, we can also
suppose that (after a possible translation of $K$ and $v_0$) we have $0 \in K$.
The key construct used will be that of the Minkowski gauge function $p$
associated to $K$, which measures (the inverse of) how far we need to go,
starting from $0$ in the direction of a vector $v$, to reach the exterior of $K$.
The precise definition of $p$ is as follows:
$$p(v) = \inf \{r > 0 : v/r \in K\}.$$

Observe that since we have assumed that the origin is an interior point
of $K$, for each $v \in \mathbb{R}^d$ there is an $r > 0$, so that $v/r \in K$. Hence $p(v)$ is
well-defined.

Figure 2 below gives an example of a gauge function in the special case
where $V = \mathbb{R}$ and $K = (a, b)$, an open interval that contains the origin.

Figure 2. The gauge function of the interval $(a, b)$ in $\mathbb{R}$

We note, for example, that if $V$ is normed and $K$ is the unit ball
$\{\|v\| < 1\}$, then $p(v) = \|v\|$.

In general, the non-negative function $p$ completely characterizes $K$ in
that
$$p(v) < 1 \text{ if and only if } v \in K. \tag{9}$$

Moreover $p$ has an important sub-linear property:
$$\begin{cases} p(av) = a\,p(v), & \text{if } a \ge 0 \text{ and } v \in V, \\ p(v_1 + v_2) \le p(v_1) + p(v_2), & \text{if } v_1 \text{ and } v_2 \in V. \end{cases} \tag{10}$$
In fact, if $v \in K$ then $v/(1 - \epsilon) \in K$ for some $\epsilon > 0$, since $K$ is open,
which gives that $p(v) < 1$. Conversely if $p(v) < 1$, then $v = (1 - \epsilon)v'$ for
some $0 < \epsilon < 1$, and $v' \in K$. Then since $v = (1 - \epsilon)v' + \epsilon \cdot 0$ this shows
$v \in K$, because $0 \in K$ and $K$ is convex.

To verify (10) we merely note that $(v_1 + v_2)/(r_1 + r_2)$ belongs to $K$,
if both $v_1/r_1$ and $v_2/r_2$ belong to $K$, in view of property (8) defining the
convexity of $K$ with $t = r_2/(r_1 + r_2)$ and $1 - t = r_1/(r_1 + r_2)$.
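For the interval $K = (a, b)$ with $a < 0 < b$, the gauge can be written down explicitly: $p(v) = v/b$ for $v > 0$ and $p(v) = v/a$ for $v < 0$. The following sketch, with hypothetical endpoints and sample points chosen only for illustration, checks properties (9) and (10) on a finite sample. It is a sanity check on examples, not a proof.

```python
def gauge_interval(v, a, b):
    """Minkowski gauge of K = (a, b), assuming a < 0 < b, at the real number v."""
    if v > 0:
        return v / b      # v / r < b exactly when r > v / b
    if v < 0:
        return v / a      # v / r > a exactly when r > v / a (a is negative)
    return 0.0


# Hypothetical interval and sample points.
a, b = -2.0, 0.5
samples = [-3.0, -1.9, -0.3, 0.0, 0.2, 0.49, 0.6, 4.0]

# Property (9): p(v) < 1 exactly when v lies in K.
char_ok = all((gauge_interval(v, a, b) < 1) == (a < v < b) for v in samples)

# Property (10): positive homogeneity (tested with the factor 3) and sub-additivity.
homog_ok = all(
    abs(gauge_interval(3 * v, a, b) - 3 * gauge_interval(v, a, b)) < 1e-12
    for v in samples
)
subadd_ok = all(
    gauge_interval(u + v, a, b)
    <= gauge_interval(u, a, b) + gauge_interval(v, a, b) + 1e-12
    for u in samples
    for v in samples
)
```

Note that this gauge is not symmetric unless $a = -b$, which is why (10) asks for homogeneity only for $a \ge 0$.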

Now our proposition will be proved once we find a linear functional $\ell$,
so that
$$\ell(v_0) = 1, \quad \text{and} \quad \ell(v) \le p(v), \quad v \in \mathbb{R}^d. \tag{11}$$

This is because $\ell(v) < 1$, for all $v \in K$ by (9). We shall construct $\ell$ in a
step-by-step manner.

First, such an $\ell$ is already determined in the one-dimensional subspace
$V_0$ spanned by $v_0$, $V_0 = \{\mathbb{R} v_0\}$, since $\ell(b v_0) = b\,\ell(v_0) = b$, when
$b \in \mathbb{R}$, and this is consistent with (11). Indeed, if $b \ge 0$ then $p(b v_0) =
b\,p(v_0) \ge b\,\ell(v_0) = \ell(b v_0)$ by (10) and (9), while (11) is immediate when
$b < 0$.

The next step is to choose any vector $v_1$ linearly independent from $v_0$
and extend $\ell$ to the subspace $V_1$ spanned by $v_0$ and $v_1$. Thus we can
make a choice for the value of $\ell$ on $v_1$, $\ell(v_1)$, so as to satisfy (11) if
$$a\,\ell(v_1) + b = \ell(a v_1 + b v_0) \le p(a v_1 + b v_0), \quad \text{for all } a, b \in \mathbb{R}.$$

Setting $a = 1$ and $b v_0 = w$ yields
$$\ell(v_1) + \ell(w) \le p(v_1 + w) \quad \text{for all } w \in V_0,$$
while setting $a = -1$ implies
$$-\ell(v_1) + \ell(w') \le p(-v_1 + w'), \quad \text{for all } w' \in V_0.$$

Altogether then it is required that for all $w, w' \in V_0$
$$-p(-v_1 + w') + \ell(w') \le \ell(v_1) \le p(v_1 + w) - \ell(w). \tag{12}$$

Notice that there is a number that lies between the two extremes of the
above inequality. This is a consequence of the fact that $-p(-v_1 + w') + \ell(w')$
never exceeds $p(v_1 + w) - \ell(w)$, which itself follows from the fact
that $\ell(w) + \ell(w') \le p(w + w') \le p(-v_1 + w') + p(v_1 + w)$, by (11) on $V_0$
and the sub-linearity of $p$. So a choice of $\ell(v_1)$ can be made that is
consistent with (12) and this allows one to extend $\ell$ to $V_1$. In the same
way we can proceed inductively to extend $\ell$ to all of $\mathbb{R}^d$.
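The one-step extension governed by (12) can be made concrete in $\mathbb{R}^2$. The sketch below rests on assumptions made only for illustration: $p$ is taken to be the $\ell^1$ norm (the gauge of the open $\ell^1$ unit ball), the vectors `v0` and `v1` are hypothetical choices, and the supremum and infimum over $w \in V_0$ are approximated on a finite grid of multiples of `v0`. It computes the two extremes in (12), picks $\ell(v_1)$ between them, and checks $\ell \le p$ on sample points.

```python
def p(v):
    """Sub-linear function: the l^1 norm on R^2 (gauge of the open l^1 unit ball)."""
    return abs(v[0]) + abs(v[1])


v0 = (2.0, 0.0)   # hypothetical point outside K, so p(v0) >= 1
v1 = (0.0, 1.0)   # hypothetical vector linearly independent from v0


def ell_on_V0(b):
    """ell(b * v0) = b, as forced by ell(v0) = 1."""
    return b


# Sample values of b; each gives w = b * v0 in V0 (a finite stand-in for all of V0).
grid = [k / 100.0 for k in range(-1000, 1001)]

# Left and right extremes of inequality (12), approximated over the grid.
lower = max(
    -p((-v1[0] + b * v0[0], -v1[1] + b * v0[1])) + ell_on_V0(b) for b in grid
)
upper = min(
    p((v1[0] + b * v0[0], v1[1] + b * v0[1])) - ell_on_V0(b) for b in grid
)

ell_v1 = 0.5 * (lower + upper)   # any value in [lower, upper] would do


def ell(v):
    """Extended functional on R^2, writing v = a * v1 + b * v0."""
    a, b = v[1], v[0] / 2.0
    return a * ell_v1 + b


points = [(x / 4.0, y / 4.0) for x in range(-20, 21) for y in range(-20, 21)]
dominated = all(ell(v) <= p(v) + 1e-12 for v in points)
```

In this example the admissible interval for $\ell(v_1)$ is $[-1, 1]$, so the extension is far from unique. This lack of uniqueness is typical of the construction.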

The argument just given here in this special context will now be carried
over in a general setting to give us an important theorem about
constructing linear functionals.


5.2 The Hahn-Banach Theorem 

We return to the general situation where we deal with an arbitrary vector
space $V$ over the reals. We assume that with $V$ we are given a real-valued
function $p$ on $V$ that satisfies the sub-linear property (10). However, as
opposed to the example of the gauge function considered above, which
by its nature is non-negative, here we do not assume that $p$ has this
property. In fact, certain $p$'s which may take on negative values are
needed in some of our applications later.

Theorem 5.2 Suppose $V_0$ is a linear subspace of $V$, and that we are
given a linear functional $\ell_0$ on $V_0$ that satisfies
$$\ell_0(v) \le p(v), \quad \text{for all } v \in V_0.$$

Then $\ell_0$ can be extended to a linear functional $\ell$ on $V$ that satisfies
$$\ell(v) \le p(v), \quad \text{for all } v \in V.$$

Proof. Suppose $V_0 \ne V$, and pick $v_1$ a vector not in $V_0$. We will first
extend $\ell_0$ to the subspace $V_1$ spanned by $V_0$ and $v_1$, as we did before.
We can do this by defining a putative extension $\ell_1$ of $\ell_0$, defined on $V_1$
by $\ell_1(\alpha v_1 + w) = \alpha\,\ell_1(v_1) + \ell_0(w)$, whenever $w \in V_0$ and $\alpha \in \mathbb{R}$, if $\ell_1(v_1)$
is chosen so that
$$\ell_1(v) \le p(v), \quad \text{for all } v \in V_1.$$

However, exactly as above, this happens when
$$-p(-v_1 + w') + \ell_0(w') \le \ell_1(v_1) \le p(v_1 + w) - \ell_0(w)$$
for all $w, w' \in V_0$.

The right-hand side exceeds the left-hand side because of $\ell_0(w') +
\ell_0(w) \le p(w' + w)$ and the sub-linearity of $p$. Thus an appropriate choice
of $\ell_1(v_1)$ is possible, giving the desired extension of $\ell_0$ from $V_0$ to $V_1$.

We can think of the extension we have constructed as the key step in
an inductive procedure. This induction, which in general is necessarily
transfinite, proceeds as follows. We well-order all vectors in $V$ that do
not belong to $V_0$, and denote this ordering by $<$. Among these vectors we
call a vector $v$ "extendable" if the linear functional $\ell_0$ has an extension
of the kind desired to the subspace spanned by $V_0$, $v$, and all vectors
$< v$. What we want to prove is in effect that all vectors not in $V_0$ are
extendable. Assume the contrary, then because of the well-ordering we
can find the smallest $v_1$ that is not extendable. Now if $V_0'$ is the space
spanned by $V_0$ and all the vectors $< v_1$, then by assumption $\ell_0$ extends
to $V_0'$. The previous step, with $V_0'$ in place of $V_0$ allows us then to extend
$\ell_0$ to the subspace spanned by $V_0'$ and $v_1$, reaching a contradiction. This
proves the theorem.

5.3 Some consequences 

The Hahn-Banach theorem has several direct consequences for Banach
spaces. Here $B^*$ denotes the dual of the Banach space $B$ as defined in
Section 3.2, that is, the space of continuous linear functionals on $B$.

Proposition 5.3 Suppose $f_0$ is a given element of $B$ with $\|f_0\| = M$.
Then there exists a continuous linear functional $\ell$ on $B$ so that $\ell(f_0) = M$
and $\|\ell\|_{B^*} = 1$.

Proof. Define $\ell_0$ on the one-dimensional subspace $\{\alpha f_0\}_{\alpha \in \mathbb{R}}$ by
$\ell_0(\alpha f_0) = \alpha M$, for each $\alpha \in \mathbb{R}$. Note that if we set $p(f) = \|f\|$ for every
$f \in B$, the function $p$ satisfies the basic sub-linear property (10). We also
observe that
$$|\ell_0(\alpha f_0)| = |\alpha| M = |\alpha| \|f_0\| = p(\alpha f_0),$$
so $\ell_0(f) \le p(f)$ on this subspace. By the extension theorem $\ell_0$ extends
to an $\ell$ defined on $B$ with $\ell(f) \le p(f) = \|f\|$, for all $f \in B$. Since this
inequality also holds for $-f$ in place of $f$ we get $|\ell(f)| \le \|f\|$, and thus
$\|\ell\|_{B^*} \le 1$. The fact that $\|\ell\|_{B^*} \ge 1$ is implied by the defining property
$\ell(f_0) = \|f_0\|$, thereby proving the proposition.
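In finite dimensions the functional of Proposition 5.3 can be exhibited directly, without any appeal to the extension theorem. The sketch below illustrates the conclusion only, under assumptions not in the text: $B$ is $\mathbb{R}^4$ with the $\ell^1$ norm, whose dual norm on coefficient vectors is the sup norm, and `f0` is a hypothetical non-zero vector.

```python
def norm1(f):
    """l^1 norm on R^n."""
    return sum(abs(x) for x in f)


def sign(x):
    """Signum of a real number: 1, -1 or 0."""
    return (x > 0) - (x < 0)


# Hypothetical non-zero element of B = (R^4, l^1 norm).
f0 = [1.5, -2.0, 0.0, 0.5]

# Candidate functional ell(f) = sum_i coeffs[i] * f[i], with coeffs = sign pattern of f0.
coeffs = [sign(x) for x in f0]


def ell(f):
    return sum(c * x for c, x in zip(coeffs, f))


# On (R^n, l^1) the norm of a functional is the sup norm of its coefficient vector.
dual_norm = max(abs(c) for c in coeffs)
value_at_f0 = ell(f0)
```

Here `value_at_f0` equals `norm1(f0)` and `dual_norm` equals 1, matching $\ell(f_0) = M$ and $\|\ell\|_{B^*} = 1$.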

Another application is to the duality of linear transformations. Suppose
$B_1$ and $B_2$ are a pair of Banach spaces, and $T$ is a bounded linear
transformation from $B_1$ to $B_2$. By this we mean that $T$ maps $B_1$
to $B_2$; it satisfies $T(\alpha f_1 + \beta f_2) = \alpha T(f_1) + \beta T(f_2)$ whenever $f_1, f_2 \in B_1$
and $\alpha$ and $\beta$ are real numbers; and that it has a bound $M$ so that
$\|T(f)\|_{B_2} \le M \|f\|_{B_1}$ for all $f \in B_1$. The least $M$ for which this inequality
holds is called the norm of $T$ and is denoted by $\|T\|$.

Often a linear transformation is initially given on a dense subspace. In 
this connection, the following proposition is very useful. 
Proposition 5.4 Let $B_1$, $B_2$ be a pair of Banach spaces and $S \subset B_1$
a dense linear subspace of $B_1$. Suppose $T_0$ is a linear transformation
from $S$ to $B_2$ that satisfies $\|T_0(f)\|_{B_2} \le M \|f\|_{B_1}$ for all $f \in S$. Then $T_0$
has a unique extension $T$ to all of $B_1$ so that $\|T(f)\|_{B_2} \le M \|f\|_{B_1}$ for all
$f \in B_1$.

Proof. If $f \in B_1$, let $\{f_n\}$ be a sequence in $S$ which converges to
$f$. Then since $\|T_0(f_n) - T_0(f_m)\|_{B_2} \le M \|f_n - f_m\|_{B_1}$ it follows that
$\{T_0(f_n)\}$ is a Cauchy sequence in $B_2$, and hence converges to a limit,
which we define to be $T(f)$. Note that the definition of $T(f)$ is independent
of the chosen sequence $\{f_n\}$, and that the resulting transformation
$T$ has all the required properties.

We now discuss duality of linear transformations. Whenever we have
a linear transformation $T$ from a Banach space $B_1$ to another Banach
space $B_2$, it induces a dual transformation, $T^*$ of $B_2^*$ to $B_1^*$, that can
be defined as follows.

Suppose $\ell_2 \in B_2^*$ (a continuous linear functional on $B_2$), then $\ell_1 =
T^*(\ell_2) \in B_1^*$ is defined by $\ell_1(f_1) = \ell_2(T(f_1))$, whenever $f_1 \in B_1$. More
succinctly
$$T^*(\ell_2)(f_1) = \ell_2(T(f_1)). \tag{13}$$

Theorem 5.5 The operator $T^*$ defined by (13) is a bounded linear
transformation from $B_2^*$ to $B_1^*$. Its norm $\|T^*\|$ satisfies $\|T\| = \|T^*\|$.

Proof. First, if $\|f_1\|_{B_1} \le 1$, we have that
$$|\ell_1(f_1)| = |\ell_2(T(f_1))| \le \|\ell_2\| \, \|T(f_1)\|_{B_2} \le \|\ell_2\| \, \|T\|.$$

Thus taking the supremum over all $f_1 \in B_1$ with $\|f_1\|_{B_1} \le 1$, we see that
the mapping $\ell_2 \mapsto T^*(\ell_2) = \ell_1$ has norm $\le \|T\|$.

To prove the reverse inequality we can find for any $\epsilon > 0$ an $f_1 \in B_1$
with $\|f_1\|_{B_1} = 1$ and $\|T(f_1)\|_{B_2} \ge \|T\| - \epsilon$. Next, with $f_2 = T(f_1) \in B_2$,
by Proposition 5.3 (with $B = B_2$) there is an $\ell_2$ in $B_2^*$ so that $\|\ell_2\|_{B_2^*} = 1$
but $\ell_2(f_2) \ge \|T\| - \epsilon$. Thus by (13) one has $T^*(\ell_2)(f_1) \ge \|T\| - \epsilon$, and
since $\|f_1\|_{B_1} = 1$, we conclude $\|T^*(\ell_2)\|_{B_1^*} \ge \|T\| - \epsilon$. This gives $\|T^*\| \ge
\|T\| - \epsilon$ for any $\epsilon > 0$, which proves the theorem.
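The identity $\|T\| = \|T^*\|$ can be seen concretely for matrices. The sketch below works under assumptions chosen for illustration: $B_1 = \mathbb{R}^3$ and $B_2 = \mathbb{R}^2$ both carry the $\ell^1$ norm, so their duals carry the sup norm, and the dual map is represented by the transposed matrix. It uses the standard facts that the $\ell^1 \to \ell^1$ operator norm of a matrix is its largest absolute column sum, and the sup-norm operator norm is the largest absolute row sum. The matrix `T` is a hypothetical example.

```python
def op_norm_l1(T):
    """Norm of T as a map (R^n, l^1) -> (R^m, l^1): largest absolute column sum."""
    rows, cols = len(T), len(T[0])
    return max(sum(abs(T[i][j]) for i in range(rows)) for j in range(cols))


def op_norm_sup(S):
    """Norm of S as a map between sup-normed spaces: largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in S)


def transpose(T):
    """Matrix of the dual map, with functionals identified with coefficient vectors."""
    return [list(col) for col in zip(*T)]


# Hypothetical 2 x 3 matrix, i.e. a map from R^3 to R^2.
T = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]
T_star = transpose(T)

norm_T = op_norm_l1(T)
norm_T_star = op_norm_sup(T_star)
```

Both quantities come out equal (5.0 for this matrix), since the rows of the transpose are the columns of `T`.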

A further quick application of the Hahn-Banach theorem is the observation
that in general $L^1$ is not the dual of $L^\infty$ (as opposed to the case
$1 \le p < \infty$ considered in Theorem 4.1).

Let us first recall that whenever $g \in L^1$, the linear functional $f \mapsto \ell(f)$
given by
$$\ell(f) = \int fg\,d\mu \tag{14}$$
is bounded on $L^\infty$, and its norm $\|\ell\|_{(L^\infty)^*}$ is $\|g\|_{L^1}$. In this way $L^1$ can be
viewed as a subspace of $(L^\infty)^*$, with the $L^1$ norm of $g$ being identical with
its norm as a linear functional. One can, however, produce a continuous
linear functional of $L^\infty$ not of this form. For simplicity we do this when
the underlying space is $\mathbb{R}$ with Lebesgue measure.

We let $C$ denote the subspace of $L^\infty(\mathbb{R})$ consisting of continuous
bounded functions on $\mathbb{R}$. Define the linear functional $\ell_0$ on $C$ (the "Dirac
delta") by
$$\ell_0(f) = f(0), \quad f \in C.$$

Clearly $|\ell_0(f)| \le \|f\|_{L^\infty}$, $f \in C$. Thus by the extension theorem, with
$p(f) = \|f\|_{L^\infty}$, we see that there is a linear functional $\ell$ on $L^\infty$, extending
$\ell_0$, that satisfies $|\ell(f)| \le \|f\|_{L^\infty}$, for all $f \in L^\infty$.

Suppose for a moment that $\ell$ were of the form (14) for some $g \in L^1$.
Since $\ell(f) = \ell_0(f) = 0$ whenever $f$ is a continuous trapezoidal function
that excludes the origin, we would have $\int fg\,dx = 0$ for such functions $f$;
by a simple limiting argument this gives $\int_I g\,dx = 0$ for all intervals
excluding the origin, and from there for all intervals $I$. Hence the indefinite
integrals $G(y) = \int_0^y g(x)\,dx$ vanish, and therefore $G' = g = 0$ by the
differentiation theorem.$^4$ This gives a contradiction, hence the linear
functional $\ell$ is not representable as (14).

5.4 The problem of measure 

We now consider an application of the Hahn-Banach theorem of a different
kind. We present a rather stunning assertion, answering a basic
question of the "problem of measure." The result states that there is a
finitely-additive$^5$ measure defined on all subsets of $\mathbb{R}^d$ that agrees with
Lebesgue measure on the measurable sets, and is translation invariant.
We formulate the theorem in one dimension.

Theorem 5.6 There is an extended-valued non-negative function $\hat{m}$, defined
on all subsets of $\mathbb{R}$ with the following properties:

(i) $\hat{m}(E_1 \cup E_2) = \hat{m}(E_1) + \hat{m}(E_2)$ whenever $E_1$ and $E_2$ are disjoint
subsets of $\mathbb{R}$.

(ii) $\hat{m}(E) = m(E)$ if $E$ is a measurable set and $m$ denotes the Lebesgue
measure.

(iii) $\hat{m}(E + h) = \hat{m}(E)$ for every set $E$ and real number $h$.

4 See for instance Theorem 3.11, in Chapter 3 of Book III.

5 The qualifier "finitely-additive" is crucial.

From (i) we see that $\hat{m}$ is finitely additive; however it cannot be countably
additive as the proof of the existence of non-measurable sets shows. (See
Section 3, Chapter 1 in Book III.)

This theorem is a consequence of another result of this kind, dealing
with an extension of the Lebesgue integral. Here the setting is the circle
$\mathbb{R}/\mathbb{Z}$, instead of $\mathbb{R}$, with the former realized as $(0, 1]$. Thus functions on
$\mathbb{R}/\mathbb{Z}$ can be thought of as functions on $(0, 1]$, extended to $\mathbb{R}$ by periodicity
with period 1. In the same way, translations on $\mathbb{R}$ induce corresponding
translations on $\mathbb{R}/\mathbb{Z}$. The assertion now is the existence of a generalized
integral (the "Banach integral") defined on all bounded functions on the
circle.

Theorem 5.7 There is a linear functional $f \mapsto I(f)$ defined on all
bounded functions $f$ on $\mathbb{R}/\mathbb{Z}$ so that:

(a) $I(f) \ge 0$, if $f(x) \ge 0$ for all $x$.

(b) $I(\alpha f_1 + \beta f_2) = \alpha I(f_1) + \beta I(f_2)$ for all $\alpha$ and $\beta$ real.

(c) $I(f) = \int_0^1 f(x)\,dx$, whenever $f$ is measurable.

(d) $I(f_h) = I(f)$, for all $h \in \mathbb{R}$ where $f_h(x) = f(x - h)$.

The right-hand side of (c) denotes the usual Lebesgue integral.

Proof. The idea is to consider the vector space $V$ of all (real-valued)
bounded functions on $\mathbb{R}/\mathbb{Z}$, with $V_0$ the subspace of those functions that
are measurable. We let $I_0$ denote the linear functional given by the
Lebesgue integral, $I_0(f) = \int_0^1 f(x)\,dx$ for $f \in V_0$. The key is to find the
appropriate sub-linear $p$ defined on $V$ so that
$$I_0(f) \le p(f), \quad \text{for all } f \in V_0.$$

Banach's ingenious definition of $p$ is as follows: We let $A = \{a_1, \ldots, a_N\}$
denote an arbitrary collection of $N$ real numbers, with $\#(A) = N$ denoting
its cardinality. Given $A$, we define $M_A(f)$ to be the real number
$$M_A(f) = \sup_{x \in \mathbb{R}} \left( \frac{1}{N} \sum_{j=1}^{N} f(x + a_j) \right)$$
and set
$$p(f) = \inf_A \{M_A(f)\},$$
where the infimum is taken over all finite collections $A$.

It is clear that $p(f)$ is well-defined, since $f$ is assumed to be bounded;
also $p(cf) = c\,p(f)$ if $c \ge 0$. To prove $p(f_1 + f_2) \le p(f_1) + p(f_2)$, we find
for each $\epsilon$, finite collections $A$ and $B$ so that
$$M_A(f_1) \le p(f_1) + \epsilon \quad \text{and} \quad M_B(f_2) \le p(f_2) + \epsilon.$$

Let $C$ be the collection $\{a_i + b_j\}_{1 \le i \le N_1,\ 1 \le j \le N_2}$ where $N_1 = \#(A)$, and
$N_2 = \#(B)$. Now it is easy to see that
$$M_C(f_1 + f_2) \le M_C(f_1) + M_C(f_2).$$

Next, we note as a general matter that $M_A(f)$ is the same as $M_{A'}(f')$
where $f' = f_h$ is a translate of $f$ and $A' = A - h$. Also the averages
corresponding to $C$ arise as averages of translates of the averages
corresponding to $A$ and $B$, and so it is easy to verify that
$$M_C(f_1) \le M_A(f_1) \quad \text{and also} \quad M_C(f_2) \le M_B(f_2).$$

Thus
$$p(f_1 + f_2) \le M_C(f_1 + f_2) \le M_A(f_1) + M_B(f_2) \le p(f_1) + p(f_2) + 2\epsilon.$$

Letting $\epsilon \to 0$ proves the sub-linearity of $p$.

Next if $f$ is Lebesgue measurable (and hence integrable since it is
bounded), then for each $A$
$$I_0(f) = \int_0^1 \left( \frac{1}{N} \sum_{j=1}^{N} f(x + a_j) \right) dx \le \int_0^1 M_A(f)\,dx = M_A(f),$$
and hence $I_0(f) \le p(f)$. Let therefore $I$ be the linear functional extending
$I_0$ from $V_0$ to $V$, whose existence is guaranteed by Theorem 5.2. It
is obvious from its definition that $p(f) \le 0$ if $f \le 0$. From this it follows
that $I(f) \le 0$ when $f \le 0$, and replacing $f$ by $-f$ we see that conclusion (a)
holds.

Next we observe that for each real $h$
$$p(f - f_h) \le 0. \tag{15}$$
In fact, for $h$ fixed and $N$ given, define the set $A_N$ to be $\{h, 2h, 3h, \ldots, Nh\}$.
Then the sum that enters in the definition of $M_{A_N}(f - f_h)$ is
$$\frac{1}{N} \sum_{j=1}^{N} \big( f(x + jh) - f(x + (j-1)h) \big),$$
and thus $|M_{A_N}(f - f_h)| \le 2M/N$, where $M$ is an upper bound for $|f|$.
Since $p(f - f_h) \le M_{A_N}(f - f_h) \to 0$, as $N \to \infty$, we see that (15) is
proved. This shows that $I(f - f_h) \le 0$, for all $f$ and $h$. However, replacing
$f$ by $f_h$ and then $h$ by $-h$, we see that $I(f_h - f) \le 0$ and thus (d) is
also established, finishing the proof of Theorem 5.7.
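The bound $2M/N$ comes from the fact that the sum telescopes to $\big(f(x + Nh) - f(x)\big)/N$. The sketch below illustrates this numerically under assumptions made for the example only: `f` is a hypothetical bounded 1-periodic step function with $|f| \le 1$, the shift `h` is an arbitrary irrational number, and the supremum over $x$ is replaced by a maximum over a finite grid.

```python
import math


def f(x):
    """A hypothetical bounded 1-periodic function with |f| <= 1."""
    t = x % 1.0
    return 1.0 if t < 0.3 else -0.5


def average_difference(x, h, N):
    """(1/N) * sum_{j=1}^{N} ( f(x + j h) - f(x + (j - 1) h) ), as in M_{A_N}(f - f_h)."""
    return sum(f(x + j * h) - f(x + (j - 1) * h) for j in range(1, N + 1)) / N


M = 1.0                    # upper bound for |f|
h = math.sqrt(2) - 1       # an arbitrary shift
xs = [k / 50.0 for k in range(50)]   # finite sample of points x in [0, 1)

# Largest absolute value of the average over the sample, for several N.
sup_by_N = {
    N: max(abs(average_difference(x, h, N)) for x in xs) for N in (10, 100, 1000)
}
```

Each entry of `sup_by_N` is at most `2 * M / N`, so the averages tend to zero as $N$ grows, in line with (15).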

As a direct consequence we have the following. 

Corollary 5.8 There is a non-negative function $\hat{m}$ defined on all subsets
of $\mathbb{R}/\mathbb{Z}$ so that:

(i) $\hat{m}(E_1 \cup E_2) = \hat{m}(E_1) + \hat{m}(E_2)$ for all disjoint subsets $E_1$ and $E_2$.

(ii) $\hat{m}(E) = m(E)$ if $E$ is measurable.

(iii) $\hat{m}(E + h) = \hat{m}(E)$ for every $h$ in $\mathbb{R}$.

We need only take $\hat{m}(E) = I(\chi_E)$, with $I$ as in Theorem 5.7, where $\chi_E$
denotes the characteristic function of $E$.

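We now turn to the proof of Theorem 5.6. Let $\mathcal{I}_j$ denote the interval
$(j, j + 1]$, where $j \in \mathbb{Z}$. Then we have a partition $\bigcup_{j=-\infty}^{\infty} \mathcal{I}_j$ of $\mathbb{R}$ into
disjoint sets.

For clarity of exposition, we temporarily relabel the measure $\hat{m}$ on
$(0, 1] = \mathcal{I}_0$ given by the corollary and call it $\hat{m}_0$. So whenever $E \subset \mathcal{I}_0$ we
defined $\hat{m}(E)$ to be $\hat{m}_0(E)$. More generally, if $E \subset \mathcal{I}_j$ we set $\hat{m}(E) =
\hat{m}_0(E - j)$.

With these things said, for any set $E$ define $\hat{m}(E)$ by
$$\hat{m}(E) = \sum_{j=-\infty}^{\infty} \hat{m}(E \cap \mathcal{I}_j) = \sum_{j=-\infty}^{\infty} \hat{m}_0\big((E \cap \mathcal{I}_j) - j\big). \tag{16}$$

Thus $\hat{m}(E)$ is given as an extended non-negative number. Note that if
$E_1$ and $E_2$ are disjoint so are $(E_1 \cap \mathcal{I}_j) - j$ and $(E_2 \cap \mathcal{I}_j) - j$. It follows
that $\hat{m}(E_1 \cup E_2) = \hat{m}(E_1) + \hat{m}(E_2)$. Moreover if $E$ is measurable then
$\hat{m}(E \cap \mathcal{I}_j) = m(E \cap \mathcal{I}_j)$ and so $\hat{m}(E) = m(E)$.

To prove $\hat{m}(E + h) = \hat{m}(E)$, consider first the case $h = k \in \mathbb{Z}$. This is
an immediate consequence of the definition (16) once one observes that
$((E + k) \cap \mathcal{I}_{j+k}) - (j + k) = (E \cap \mathcal{I}_j) - j$, for all $j, k \in \mathbb{Z}$.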
Next suppose $0 < h < 1$. We then decompose $E \cap \mathcal{I}_j$ as $E_j' \cup E_j''$, with
$E_j' = E \cap (j, j + 1 - h]$ and $E_j'' = E \cap (j + 1 - h, j + 1]$. The point of
this decomposition is that $E_j' + h$ remains in $\mathcal{I}_j$ but $E_j'' + h$ is placed
in $\mathcal{I}_{j+1}$. In any case, $E = \bigcup_j E_j' \cup \bigcup_j E_j''$, and the union is disjoint.

Thus using the first additivity property proved above and then (16)
we see that
$$\hat{m}(E) = \sum_{j=-\infty}^{\infty} \big( \hat{m}(E_j') + \hat{m}(E_j'') \big).$$

Similarly
$$\hat{m}(E + h) = \sum_{j=-\infty}^{\infty} \big( \hat{m}(E_j' + h) + \hat{m}(E_j'' + h) \big).$$

Now both $E_j'$ and $E_j' + h$ are in $\mathcal{I}_j$, hence $\hat{m}(E_j') = \hat{m}(E_j' + h)$ by the
translation invariance of $\hat{m}_0$ and the definition of $\hat{m}$ on subsets of $\mathcal{I}_j$.
Also $E_j''$ is in $\mathcal{I}_j$ and $E_j'' + h$ is in $\mathcal{I}_{j+1}$, and their measures agree for the
same reasons. This establishes that $\hat{m}(E) = \hat{m}(E + h)$, for $0 < h < 1$.
Now combining this with the translation invariance with respect to $\mathbb{Z}$
already proved, we obtain conclusion (iii) of Theorem 5.6 for all $h$, and
hence the theorem is completely proved.

For the corresponding extension of Lebesgue measure in $\mathbb{R}^d$ and other
related results, see Exercise 36 and Problems 8* and 9*.


6 Complex $L^p$ and Banach spaces

We have supposed in Section 3.2 onwards that our $L^p$ and Banach spaces
are taken over the reals. However, the statements and the proofs of
the corresponding theorems for those spaces taken with respect to the
complex scalars are for the most part routine adaptations of the real case.
There are nevertheless several instances that require further comment.
First, in the argument concerning the converse of Hölder's inequality
(Lemma 4.2), the definition of $f$ should read
$$f(x) = |g(x)|^{q-1} \, \frac{\overline{\operatorname{sign} g(x)}}{\|g\|_{L^q}^{q-1}},$$
where now "sign" denotes the complex version of the signum function,
defined by $\operatorname{sign} z = z/|z|$ if $z \ne 0$, and $\operatorname{sign} 0 = 0$. There are similar
occurrences with $g$ replaced by $g_n$.

Second, while the Hahn-Banach theorem is valid as stated only for real
vector spaces, a version of the complex case (sufficient for the applications
in Section 5.3 where $p(f) = \|f\|$) can be found in Exercise 33 below.
7 Appendix: The dual of $C(X)$

In this appendix, we describe the bounded linear functionals of the space $C(X)$
of continuous real-valued functions on $X$. To begin with, we assume that $X$ is a
compact metric space. Our main result then states that if $\ell \in C(X)^*$, then there
exists a finite signed Borel measure $\mu$ (this measure is sometimes referred to as a
Radon measure) so that
$$\ell(f) = \int_X f(x)\,d\mu(x) \quad \text{for all } f \in C(X).$$

Before proceeding with the argument leading to this result, we collect some basic
facts and definitions.

Let $X$ be a metric space with metric $d$, and assume that $X$ is compact; that is,
every covering of $X$ by open sets contains a finite sub-covering. The vector space
$C(X)$ of real-valued continuous functions on $X$ equipped with the sup-norm
$$\|f\| = \sup_{x \in X} |f(x)|, \quad f \in C(X),$$
is a Banach space over $\mathbb{R}$. Given a continuous function $f$ on $X$ we define the
support of $f$, denoted $\operatorname{supp}(f)$, as the closure of the set $\{x \in X : f(x) \ne 0\}$.$^6$

We recall some simple facts about continuous functions and open and closed 
sets in X that we shall use below. 

(i) Separation. If $A$ and $B$ are two disjoint closed subsets of $X$, then there
exists a continuous function $f$ with $f = 1$ on $A$, $f = 0$ on $B$, and $0 < f < 1$ in the
complements of $A$ and $B$.

Indeed, one can take for instance
$$f(x) = \frac{d(x, B)}{d(x, A) + d(x, B)},$$
where $d(x, B) = \inf_{y \in B} d(x, y)$, with a similar definition for $d(x, A)$.

(ii) Partition of unity. If $K$ is a compact set which is covered by finitely many
open sets $\{O_k\}_{k=1}^{N}$, then there exist continuous functions $\eta_k$ for $1 \le k \le N$ so
that $0 \le \eta_k \le 1$, $\operatorname{supp}(\eta_k) \subset O_k$, and $\sum_{k=1}^{N} \eta_k(x) = 1$ whenever $x \in K$. Moreover,
$0 \le \sum_{k=1}^{N} \eta_k(x) \le 1$ for all $x \in X$.

One can argue as follows. For each $x \in K$, there exists a ball $B(x)$ centered at $x$
and of positive radius such that $\overline{B(x)} \subset O_i$ for some $i$. Since $\bigcup_{x \in K} B(x)$ covers $K$,
we can select a finite subcovering, say $\bigcup_{j=1}^{M} B(x_j)$. For each $1 \le k \le N$, let $U_k$
be the union of all open balls $B(x_j)$ so that $\overline{B(x_j)} \subset O_k$; clearly $K \subset \bigcup_{k=1}^{N} U_k$.
By (i) above, there exists a continuous function $0 \le \varphi_k \le 1$ so that $\varphi_k = 1$ on $\overline{U_k}$
and $\operatorname{supp}(\varphi_k) \subset O_k$. If we define
$$\eta_1 = \varphi_1, \quad \eta_2 = \varphi_2(1 - \varphi_1), \quad \ldots, \quad \eta_N = \varphi_N(1 - \varphi_1) \cdots (1 - \varphi_{N-1})$$
then $\operatorname{supp}(\eta_k) \subset O_k$ and
$$\eta_1 + \cdots + \eta_N = 1 - (1 - \varphi_1) \cdots (1 - \varphi_N),$$
thus guaranteeing the desired properties.

6 This is the common usage of the terminology "support." In Book III, Chapter 2, we
used "support of $f$" to indicate the set where $f(x) \ne 0$, which is convenient when dealing
with measurable functions.
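The algebraic identity behind the partition of unity can be checked pointwise. In the sketch below, the list `phis` stands for the values $\varphi_1(x), \ldots, \varphi_N(x)$ at a single point $x$; the numbers are hypothetical, with one of them equal to 1 to mimic a point of $K$ (which lies in some $U_k$).

```python
def etas(phis):
    """Given phi_1(x), ..., phi_N(x) in [0, 1] at one point x, return eta_1(x), ..., eta_N(x)."""
    out, remaining = [], 1.0
    for phi in phis:
        out.append(phi * remaining)   # eta_k = phi_k (1 - phi_1) ... (1 - phi_{k-1})
        remaining *= (1.0 - phi)
    return out


# Hypothetical values at a point of K: the fourth function equals 1 there.
phis = [0.2, 0.0, 0.7, 1.0, 0.4]
eta = etas(phis)

product = 1.0
for phi in phis:
    product *= (1.0 - phi)           # (1 - phi_1) ... (1 - phi_N)

total = sum(eta)                     # eta_1 + ... + eta_N
```

Here `total` agrees with `1 - product`, and since one factor of the product vanishes, the sum is 1 at this point. At a point where every $\varphi_k$ is below 1 the sum stays strictly between 0 and 1 or equals 0.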

Recall$^7$ that the Borel $\sigma$-algebra of $X$, which is denoted by $\mathcal{B}_X$, is the smallest
$\sigma$-algebra of $X$ that contains the open sets. Elements of $\mathcal{B}_X$ are called Borel sets,
and a measure defined on $\mathcal{B}_X$ is called a Borel measure. If a Borel measure is
finite, that is $\mu(X) < \infty$, then it satisfies the following "regularity property": for
any Borel set $E$ and any $\epsilon > 0$, there are an open set $O$ and a closed set $F$ such
that $E \subset O$ and $\mu(O - E) < \epsilon$, while $F \subset E$ and $\mu(E - F) < \epsilon$.

In general we shall be interested in finite signed Borel measures on $X$, that
is, measures which can take on negative values. If $\mu$ is such a measure, and $\mu^+$
and $\mu^-$ denote the positive and negative variations of $\mu$, then $\mu = \mu^+ - \mu^-$, and
integration with respect to $\mu$ is defined by $\int f\,d\mu = \int f\,d\mu^+ - \int f\,d\mu^-$. Conversely,
if $\mu_1$ and $\mu_2$ are two finite Borel measures, then $\mu = \mu_1 - \mu_2$ is a finite signed Borel
measure, and $\int f\,d\mu = \int f\,d\mu_1 - \int f\,d\mu_2$.

We denote by $M(X)$ the space of finite signed Borel measures on $X$. Clearly,
$M(X)$ is a vector space which can be equipped with the following norm
$$\|\mu\| = |\mu|(X),$$
where $|\mu|$ denotes the total variation of $\mu$. It is a simple fact that $M(X)$ with this
norm is a Banach space.

7.1 The case of positive linear functionals 

We begin by considering only linear functionals $\ell : C(X) \to \mathbb{R}$ which are positive,
that is, $\ell(f) \ge 0$ whenever $f(x) \ge 0$ for all $x \in X$. Observe that positive linear
functionals are automatically bounded and that $\|\ell\| = \ell(1)$. Indeed, note that
$|f(x)| \le \|f\|$, hence $\|f\| \pm f \ge 0$, and therefore $|\ell(f)| \le \ell(1)\|f\|$.

Our main result goes as follows.

Theorem 7.1 Suppose $X$ is a compact metric space and $\ell$ a positive linear
functional on $C(X)$. Then there exists a unique finite (positive) Borel measure $\mu$ so
that
$$\ell(f) = \int_X f(x)\,d\mu(x) \quad \text{for all } f \in C(X). \tag{17}$$

Proof. The existence of the measure $\mu$ is proved as follows. Consider the
function $\rho$ on the open subsets of $X$ defined by
$$\rho(O) = \sup \{\ell(f), \text{ where } \operatorname{supp}(f) \subset O, \text{ and } 0 \le f \le 1\},$$
and let the function $\mu_*$ be defined on all subsets of $X$ by
$$\mu_*(E) = \inf \{\rho(O), \text{ where } E \subset O \text{ and } O \text{ is open}\}.$$

We contend that $\mu_*$ is a metric exterior measure on $X$.

7 The definitions and results on measure theory needed in this section, in particular the
extension of a premeasure used in the proof of Theorem 7.1, can be found in Chapter 6
of Book III.

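Indeed, we clearly must have $\mu_*(E_1) \le \mu_*(E_2)$ whenever $E_1 \subset E_2$. Also, if $O$ is
open, then $\mu_*(O) = \rho(O)$. To show that $\mu_*$ is countably sub-additive on subsets
of $X$, we begin by proving that $\mu_*$ is in fact sub-additive on open sets $\{O_k\}$, that
is,
$$\mu_*\Big( \bigcup_{k=1}^{\infty} O_k \Big) \le \sum_{k=1}^{\infty} \mu_*(O_k). \tag{18}$$

To do so, suppose $\{O_k\}_{k=1}^{\infty}$ is a collection of open sets in $X$, and let $O = \bigcup_{k=1}^{\infty} O_k$.
If $f$ is any continuous function that satisfies $\operatorname{supp}(f) \subset O$ and $0 \le f \le 1$, then
by compactness of $K = \operatorname{supp}(f)$ we can pick a sub-cover so that (after relabeling
the sets $O_k$, if necessary) $K \subset \bigcup_{k=1}^{N} O_k$. Let $\{\eta_k\}_{k=1}^{N}$ be a partition of unity of
$\{O_1, \ldots, O_N\}$ (as discussed above in (ii)); this means that each $\eta_k$ is continuous
with $0 \le \eta_k \le 1$, $\operatorname{supp}(\eta_k) \subset O_k$ and $\sum_{k=1}^{N} \eta_k(x) = 1$ for all $x \in K$. Hence recalling
that $\mu_* = \rho$ on open sets, we get
$$\ell(f) = \sum_{k=1}^{N} \ell(f \eta_k) \le \sum_{k=1}^{N} \mu_*(O_k) \le \sum_{k=1}^{\infty} \mu_*(O_k),$$
where the first inequality follows because $\operatorname{supp}(f \eta_k) \subset O_k$ and $0 \le f \eta_k \le 1$. Taking
the supremum over $f$ we find that $\mu_*\big( \bigcup_{k=1}^{\infty} O_k \big) \le \sum_{k=1}^{\infty} \mu_*(O_k)$.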
We now turn to the proof of the sub-additivity of $\mu_*$ on all sets. Suppose $\{E_k\}$
is a collection of subsets of $X$ and let $\epsilon > 0$. For each $k$, pick an open set $O_k$
so that $E_k \subset O_k$ and $\mu_*(O_k) \le \mu_*(E_k) + \epsilon 2^{-k}$. Since $O = \bigcup O_k$ covers $\bigcup E_k$, we
must have by (18) that
$$\mu_*\Big( \bigcup E_k \Big) \le \mu_*(O) \le \sum_k \mu_*(O_k) \le \sum_k \mu_*(E_k) + \epsilon,$$
and consequently $\mu_*\big( \bigcup E_k \big) \le \sum_k \mu_*(E_k)$ as desired.

The last property we must verify is that $\mu_*$ is metric, in the sense that if
$d(E_1, E_2) > 0$, then $\mu_*(E_1 \cup E_2) = \mu_*(E_1) + \mu_*(E_2)$. Indeed, the separation
condition implies that there exist disjoint open sets $O_1$ and $O_2$ so that $E_1 \subset O_1$
and $E_2 \subset O_2$. Therefore, if $O$ is any open subset which contains $E_1 \cup E_2$, then
$O \supset (O \cap O_1) \cup (O \cap O_2)$, where this union is disjoint. Hence the additivity of
$\mu_*$ on disjoint open sets, and its monotonicity give
$$\mu_*(O) \ge \mu_*(O \cap O_1) + \mu_*(O \cap O_2) \ge \mu_*(E_1) + \mu_*(E_2),$$
since $E_1 \subset (O \cap O_1)$ and $E_2 \subset (O \cap O_2)$. So $\mu_*(E_1 \cup E_2) \ge \mu_*(E_1) + \mu_*(E_2)$, and
since the reverse inequality has already been shown above, this concludes the proof
that $\mu_*$ is a metric exterior measure.

By Theorems 1.1 and 1.2 in Chapter 6 of Book III, there exists a Borel measure
$\mu$ on $\mathcal{B}_X$ which extends $\mu_*$. Clearly, $\mu$ is finite with $\mu(X) = \ell(1)$.

We now prove that this measure satisfies (17). Let $f \in C(X)$. Since $f$ can be
written as the difference of two continuous non-negative functions, we can assume
after rescaling, that $0 \le f(x) \le 1$ for all $x \in X$. The idea now is to slice $f$, that is,
write $f = \sum f_n$ where each $f_n$ is continuous and relatively small in the sup-norm.
More precisely, let $N$ be a fixed positive integer, define $O_0 = X$, and for every
integer $n \ge 1$, let

$$O_n = \{x \in X : f(x) > (n-1)/N\}.$$

Thus $O_n \supset O_{n+1}$ and $O_{N+1} = \emptyset$. Now if we define

$$f_n(x) = \begin{cases} 1/N & \text{if } x \in O_{n+1}, \\ f(x) - (n-1)/N & \text{if } x \in O_n - O_{n+1}, \\ 0 & \text{if } x \in O_n^c, \end{cases}$$

then the functions $f_n$ are continuous and they “pile up” to yield $f$, that is, $f = \sum_{n=1}^N f_n$.
Since $N f_n = 1$ on $O_{n+1}$, $\operatorname{supp}(N f_n) \subset \overline{O_n} \subset O_{n-1}$, and also $0 \le N f_n \le 1$,
we have $\mu(O_{n+1}) \le \ell(N f_n) \le \mu(O_{n-1})$, and therefore by linearity

$$\frac{1}{N} \sum_{n=1}^N \mu(O_{n+1}) \le \ell(f) \le \frac{1}{N} \sum_{n=1}^N \mu(O_{n-1}). \tag{19}$$

The properties of $N f_n$ also imply $\mu(O_{n+1}) \le \int N f_n \, d\mu \le \mu(O_n)$, hence

$$\frac{1}{N} \sum_{n=1}^N \mu(O_{n+1}) \le \int f \, d\mu \le \frac{1}{N} \sum_{n=1}^N \mu(O_n). \tag{20}$$

Consequently, combining the inequalities (19) and (20) yields

$$\left| \ell(f) - \int f \, d\mu \right| \le \frac{2\mu(X)}{N}.$$

In the limit as $N \to \infty$ we conclude that $\ell(f) = \int f \, d\mu$ as desired.
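The slicing step can be checked pointwise with a small numerical sketch. It is only an illustration under stated assumptions: the helper names `slices` and `pile_up_error` are hypothetical, and the code works with a single value $f(x) \in [0,1]$ rather than with a function on $X$. At such a point the three cases defining $f_n$ amount to clamping $f(x) - (n-1)/N$ to the interval $[0, 1/N]$.

```python
def slices(value, N):
    """Return [f_1(x), ..., f_N(x)] at a point x where f(x) = value, 0 <= value <= 1."""
    out = []
    for n in range(1, N + 1):
        lower = (n - 1) / N
        # Clamping to [0, 1/N] reproduces the three cases:
        #   f(x) > n/N              -> 1/N
        #   (n-1)/N < f(x) <= n/N   -> f(x) - (n-1)/N
        #   f(x) <= (n-1)/N         -> 0
        out.append(min(max(value - lower, 0.0), 1.0 / N))
    return out


def pile_up_error(value, N):
    """Distance between the sum of the slices and the original value."""
    return abs(sum(slices(value, N)) - value)


if __name__ == "__main__":
    for v in (0.0, 0.37, 0.5, 1.0):
        print(v, pile_up_error(v, 10))
```

Each slice has sup-norm at most $1/N$, which is what makes the two-sided bounds (19) and (20) differ by at most a multiple of $1/N$.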

Finally, we prove uniqueness. Suppose $\mu'$ is another finite positive Borel measure
on $X$ that satisfies $\ell(f) = \int f \, d\mu'$ for all $f \in C(X)$. If $O$ is an open set, and
$0 \le f \le 1$ with $\operatorname{supp}(f) \subset O$, then

$$\ell(f) = \int f \, d\mu' = \int_O f \, d\mu' \le \int_O 1 \, d\mu' = \mu'(O).$$

Taking the supremum over $f$ and recalling the definition of $\mu$ yields $\mu(O) \le \mu'(O)$.
For the reverse inequality, recall the inner regularity condition satisfied by a finite
Borel measure: given $\epsilon > 0$, there exists a closed set $K$ so that $K \subset O$, and
$\mu'(O - K) < \epsilon$. By the separation property (i) noted above applied to $K$ and $O^c$, we can
pick a continuous function $f$ so that $0 \le f \le 1$, $\operatorname{supp}(f) \subset O$ and $f = 1$ on $K$.
Then

$$\mu'(O) \le \mu'(K) + \epsilon \le \int_K f \, d\mu' + \epsilon \le \ell(f) + \epsilon \le \mu(O) + \epsilon.$$

Since $\epsilon$ was arbitrary, we obtain the desired inequality, and therefore $\mu(O) = \mu'(O)$
for all open sets $O$. This implies that $\mu = \mu'$ on all Borel sets, and the proof of
the theorem is complete.

7.2 The main result 

The main point is to write an arbitrary bounded linear functional on C(X) as the 
difference of two positive linear functionals. 

Proposition 7.2 Suppose $X$ is a compact metric space and let $\ell$ be a bounded
linear functional on $C(X)$. Then there exist positive linear functionals $\ell^+$ and $\ell^-$
so that $\ell = \ell^+ - \ell^-$. Moreover, $\|\ell\| = \ell^+(1) + \ell^-(1)$.

Proof. For $f \in C(X)$ with $f \ge 0$, we define

$$\ell^+(f) = \sup\{\ell(\varphi) : 0 \le \varphi \le f\}.$$

Clearly, we have $0 \le \ell^+(f) \le \|\ell\| \|f\|$ and $\ell(f) \le \ell^+(f)$. If $\alpha \ge 0$ and $f \ge 0$, then
$\ell^+(\alpha f) = \alpha \ell^+(f)$. Now suppose that $f, g \ge 0$. On the one hand we have
$\ell^+(f) + \ell^+(g) \le \ell^+(f + g)$, because if $0 \le \varphi \le f$ and $0 \le \psi \le g$, then $0 \le \varphi + \psi \le f + g$.
On the other hand, suppose $0 \le \varphi \le f + g$, and let $\varphi_1 = \min(\varphi, f)$ and $\varphi_2 = \varphi - \varphi_1$.
Then $0 \le \varphi_1 \le f$ and $0 \le \varphi_2 \le g$, and $\ell(\varphi) = \ell(\varphi_1) + \ell(\varphi_2) \le \ell^+(f) + \ell^+(g)$.
Taking the supremum over $\varphi$, we get $\ell^+(f + g) \le \ell^+(f) + \ell^+(g)$. We conclude
from the above that $\ell^+(f + g) = \ell^+(f) + \ell^+(g)$ whenever $f, g \ge 0$.

We can now extend $\ell^+$ to a positive linear functional on $C(X)$ as follows. Given
an arbitrary function $f$ in $C(X)$ we can write $f = f^+ - f^-$, where $f^+, f^- \ge 0$,
and define $\ell^+$ on $f$ by $\ell^+(f) = \ell^+(f^+) - \ell^+(f^-)$. Using the linearity of $\ell^+$ on
non-negative functions, one checks easily that the definition of $\ell^+(f)$ is independent
of the decomposition of $f$ into the difference of two non-negative functions. From
the definition we see that $\ell^+$ is positive, and it is easy to check that $\ell^+$ is linear
on $C(X)$, and that $\|\ell^+\| \le \|\ell\|$.

Finally, we define $\ell^- = \ell^+ - \ell$, and see immediately that $\ell^-$ is also a positive
linear functional on $C(X)$.

Now since $\ell^+$ and $\ell^-$ are positive, we have $\|\ell^+\| = \ell^+(1)$ and $\|\ell^-\| = \ell^-(1)$,
therefore $\|\ell\| \le \ell^+(1) + \ell^-(1)$. For the reverse inequality, suppose $0 \le \varphi \le 1$. Then
$|2\varphi - 1| \le 1$, hence $\|\ell\| \ge \ell(2\varphi - 1)$. By linearity of $\ell$, and taking the supremum
over $\varphi$ we obtain $\|\ell\| \ge 2\ell^+(1) - \ell(1)$. Since $\ell(1) = \ell^+(1) - \ell^-(1)$ we get
$\|\ell\| \ge \ell^+(1) + \ell^-(1)$, and the proof is complete.
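The decomposition can be made concrete in the simplest compact case, a finite set $X$ with $n$ points, where $C(X)$ is $\mathbb{R}^n$ with the sup-norm and every linear functional has the form $\ell(f) = \sum_i w_i f_i$. The sketch below is an illustration under that assumption only; the names `ell`, `ell_plus` and the weight vector `w` are hypothetical. It computes $\ell^+(f)$ from the definition by searching the vertices of the box $0 \le \varphi \le f$ (a linear functional attains its supremum over a box at a vertex), and then compares $\ell^+(1) + \ell^-(1)$ with the norm $\sum_i |w_i|$.

```python
from itertools import product


def ell(w, f):
    """The functional ell(f) = sum_i w_i f_i on a finite set X."""
    return sum(wi * fi for wi, fi in zip(w, f))


def ell_plus(w, f):
    """sup{ ell(phi) : 0 <= phi <= f } for f >= 0, found by checking box vertices."""
    best = 0.0
    for mask in product([0, 1], repeat=len(f)):
        phi = [m * fi for m, fi in zip(mask, f)]
        best = max(best, ell(w, phi))
    return best


w = [2.0, -1.5, 0.5, -3.0]            # hypothetical weights defining ell
one = [1.0] * len(w)                  # the constant function 1
plus_one = ell_plus(w, one)           # ell^+(1)
minus_one = plus_one - ell(w, one)    # ell^-(1), using ell^- = ell^+ - ell
norm = sum(abs(wi) for wi in w)       # norm of ell on (R^n, sup-norm)

if __name__ == "__main__":
    print(plus_one, minus_one, norm)
```

In this finite setting $\ell^+$ keeps the positive weights and $\ell^-$ the absolute values of the negative ones, which is the discrete counterpart of splitting a signed measure into its positive and negative variations.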

We are now ready to state and prove the main result. 

Theorem 7.3 Let $X$ be a compact metric space and $C(X)$ the Banach space of
continuous real-valued functions on $X$. Then, given any bounded linear functional $\ell$
on $C(X)$, there exists a unique finite signed Borel measure $\mu$ on $X$ so that

$$\ell(f) = \int_X f(x) \, d\mu(x) \quad \text{for all } f \in C(X).$$

Moreover, $\|\ell\| = \|\mu\| = |\mu|(X)$. In other words $C(X)^*$ is isometric to $M(X)$.

Proof. By the proposition, there exist two positive linear functionals $\ell^+$ and $\ell^-$
so that $\ell = \ell^+ - \ell^-$. Applying Theorem 7.1 to each of these positive linear
functionals yields two finite Borel measures $\mu_1$ and $\mu_2$. If we define $\mu = \mu_1 - \mu_2$, then
$\mu$ is a finite signed Borel measure and $\ell(f) = \int f \, d\mu$.

Now we have

$$|\ell(f)| \le \int |f| \, d|\mu| \le \|f\| \, |\mu|(X),$$

and thus $\|\ell\| \le |\mu|(X)$. Since we also have $|\mu|(X) \le \mu_1(X) + \mu_2(X) = \ell^+(1) + \ell^-(1) = \|\ell\|$,
we conclude that $\|\ell\| = |\mu|(X)$ as desired.

To prove uniqueness, suppose $\int f \, d\mu = \int f \, d\mu'$ for some finite signed Borel
measures $\mu$ and $\mu'$, and all $f \in C(X)$. Then if $\nu = \mu - \mu'$, one has $\int f \, d\nu = 0$, and
consequently, if $\nu^+$ and $\nu^-$ are the positive and negative variations of $\nu$, one finds
that the two positive linear functionals defined on $C(X)$ by $\ell^+(f) = \int f \, d\nu^+$ and
$\ell^-(f) = \int f \, d\nu^-$ are identical. By the uniqueness in Theorem 7.1, we conclude
that $\nu^+ = \nu^-$, hence $\nu = 0$ and $\mu = \mu'$, as desired.

7.3 An extension 

Because of its later application, it is useful to observe that Theorem 7.1 has an
extension when we drop the assumption that the space $X$ is compact. Here we
define the space $C_b(X)$ of continuous bounded functions $f$ on $X$, with norm
$\|f\| = \sup_{x \in X} |f(x)|$.

Theorem 7.4 Suppose $X$ is a metric space and $\ell$ a positive linear functional on
$C_b(X)$. For simplicity assume that $\ell$ is normalized so that $\ell(1) = 1$. Assume also
that for each $\epsilon > 0$, there is a compact set $K_\epsilon \subset X$ so that

$$|\ell(f)| \le \sup_{x \in K_\epsilon} |f(x)| + \epsilon \|f\|, \quad \text{for all } f \in C_b(X). \tag{21}$$

Then there exists a unique finite (positive) Borel measure $\mu$ so that

$$\ell(f) = \int_X f(x) \, d\mu(x), \quad \text{for all } f \in C_b(X).$$

The extra hypothesis (21) (which is vacuous when $X$ is compact) is a “tightness”
assumption that will be relevant in Chapter 6. Note that as before $|\ell(f)| \le \|f\|$
since $\ell(1) = 1$, even without the assumption (21).

The proof of this theorem proceeds as that of Theorem 7.1, save for one key
aspect. First we define

$$\rho(O) = \sup\{\ell(f), \text{ where } f \in C_b(X), \ \operatorname{supp}(f) \subset O, \text{ and } 0 \le f \le 1\}.$$

The change that is required is in the proof of the countable sub-additivity of
$\rho$, in that the supports of the $f$'s (in the definition of $\rho(O)$) are now not necessarily
compact. In fact, suppose $O = \bigcup_{k=1}^\infty O_k$ is a countable union of open sets. Let $C$ be
the support of $f$, and given a fixed $\epsilon > 0$, set $K = C \cap K_\epsilon$, with $K_\epsilon$ the compact
set arising in (21). Then $K$ is compact and $\bigcup_{k=1}^\infty O_k$ covers $K$. Proceeding as
before, we obtain a partition of unity $\{\eta_k\}_{k=1}^N$, with $\eta_k$ supported in $O_k$ and
$\sum_{k=1}^N \eta_k(x) = 1$, for $x \in K$. Now $f - \sum_{k=1}^N f \eta_k$ vanishes on $K_\epsilon$. Thus by (21)

$$\left| \ell(f) - \sum_{k=1}^N \ell(f \eta_k) \right| \le \epsilon,$$

and hence

$$\ell(f) \le \sum_{k=1}^\infty \rho(O_k) + \epsilon.$$

Since this holds for each $\epsilon$, we obtain the required sub-additivity of $\rho$ and thus
of $\mu_*$. The proof of the theorem can then be concluded as before.


Theorem 7.4 did not require that the metric space X be either complete or 
separable. However if we make these two further assumptions on X, then the 
condition (21) is actually necessary. 

Indeed, suppose $\ell(f) = \int_X f \, d\mu$, where $\mu$ is a positive finite Borel measure on $X$,
which we may assume is normalized, $\mu(X) = 1$. Under the assumption that $X$ is
complete and separable, then for each fixed $\epsilon > 0$ there is a compact set $K_\epsilon$ so
that $\mu(K_\epsilon^c) < \epsilon$. Indeed, let $\{c_k\}$ be a dense sequence in $X$. Since for each $m$
the collection of balls $\{B_{1/m}(c_k)\}_{k=1}^\infty$ covers $X$, there is a finite $N_m$ so that if
$O_m = \bigcup_{k=1}^{N_m} B_{1/m}(c_k)$, then $\mu(O_m) \ge 1 - \epsilon/2^m$.

Take $K_\epsilon = \bigcap_{m=1}^\infty \overline{O_m}$. Then $\mu(K_\epsilon) \ge 1 - \epsilon$; also, $K_\epsilon$ is closed and totally
bounded, in the sense that for every $\delta > 0$, the set $K_\epsilon$ can be covered by finitely
many balls of radius $\delta$. Since $X$ is complete, $K_\epsilon$ must be compact. Now (21)
follows immediately.


8 Exercises 

1. Consider $L^p = L^p(\mathbb{R}^d)$ with Lebesgue measure. Let $f_0(x) = |x|^{-\alpha}$ if $|x| < 1$,
$f_0(x) = 0$ for $|x| \ge 1$; also let $f_\infty(x) = |x|^{-\alpha}$ if $|x| \ge 1$, $f_\infty(x) = 0$ when $|x| < 1$.
Show that:

(a) $f_0 \in L^p$ if and only if $p\alpha < d$.

(b) $f_\infty \in L^p$ if and only if $d < p\alpha$.

(c) What happens if in the definitions of $f_0$ and $f_\infty$ we replace $|x|^{-\alpha}$ by
$|x|^{-\alpha}/(\log(2/|x|))$ for $|x| < 1$, and $|x|^{-\alpha}$ by $|x|^{-\alpha}/(\log(2|x|))$ for $|x| \ge 1$?


2. Consider the spaces $L^p(\mathbb{R}^d)$, when $0 < p < \infty$.

(a) Show that if $\|f + g\|_{L^p} \le \|f\|_{L^p} + \|g\|_{L^p}$ for all $f$ and $g$, then necessarily
$p \ge 1$.

(b) Consider $L^p(\mathbb{R})$ where $0 < p < 1$. Show that there are no bounded linear
functionals on this space. In other words, if $\ell$ is a linear function $L^p(\mathbb{R}) \to \mathbb{C}$
that satisfies

$$|\ell(f)| \le M \|f\|_{L^p(\mathbb{R})} \quad \text{for all } f \in L^p(\mathbb{R}) \text{ and some } M > 0,$$

then $\ell = 0$.

[Hint: For (a), prove that if $0 < p < 1$ and $x, y > 0$, then $x^p + y^p > (x + y)^p$.
For (b), let $F$ be defined by $F(x) = \ell(\chi_x)$, where $\chi_x$ is the characteristic
function of $[0, x]$, and consider $F(x) - F(y)$.]

3. If $f \in L^p$ and $g \in L^q$, both not identically equal to zero, show that equality
holds in Hölder’s inequality (Theorem 1.1) if and only if there exist two non-zero
constants $a, b \ge 0$ such that $a|f(x)|^p = b|g(x)|^q$ for a.e. $x$.

4. Suppose $X$ is a measure space and $0 < p < 1$.

(a) Prove that $\|fg\|_{L^1} \ge \|f\|_{L^p} \|g\|_{L^q}$. Note that $q$, the conjugate exponent of
$p$, is negative.

(b) Suppose $f_1$ and $f_2$ are non-negative. Then $\|f_1 + f_2\|_{L^p} \ge \|f_1\|_{L^p} + \|f_2\|_{L^p}$.

(c) The function $d(f, g) = \|f - g\|_{L^p}^p$ for $f, g \in L^p$ defines a metric on $L^p(X)$.

5. Let $X$ be a measure space. Using the argument to prove the completeness
of $L^p(X)$, show that if the sequence $\{f_n\}$ converges to $f$ in the $L^p$ norm, then a
subsequence of $\{f_n\}$ converges to $f$ almost everywhere.

6. Let $(X, \mathcal{F}, \mu)$ be a measure space. Show that:

(a) The simple functions are dense in $L^\infty(X)$ if $\mu(X) < \infty$, and;

(b) The simple functions are dense in $L^p(X)$ for $1 \le p < \infty$.

[Hint: For (a), use $E_{\ell,j} = \{x \in X : M\ell/j \le f(x) < M(\ell+1)/j\}$ where $-j \le \ell \le j$, and
$M = \|f\|_{L^\infty}$. Then consider the functions $f_j$ that equal $M\ell/j$ on $E_{\ell,j}$. For (b) use
a construction similar to that in (a).]

7. Consider the $L^p$ spaces, $1 \le p < \infty$, on $\mathbb{R}^d$ with Lebesgue measure. Prove that:

(a) The family of continuous functions with compact support is dense in $L^p$,
and in fact:

(b) The family of indefinitely differentiable functions with compact support is
dense in $L^p$.

The cases of $L^1$ and $L^2$ are in Theorem 2.4, Chapter 2 of Book III, and Lemma 3.1,
Chapter 5 of Book III.

8. Suppose $1 \le p < \infty$, and that $\mathbb{R}^d$ is equipped with Lebesgue measure. Show
that if $f \in L^p(\mathbb{R}^d)$, then

$$\|f(x + h) - f(x)\|_{L^p} \to 0 \quad \text{as } |h| \to 0.$$

Prove that this fails when $p = \infty$.

[Hint: By the previous exercise, the continuous functions with compact support
are dense in $L^p(\mathbb{R}^d)$ for $1 \le p < \infty$. See also Theorem 2.4 and Proposition 2.5 in
Chapter 2 of Book III.]


9. Suppose $X$ is a measure space and $1 \le p_0 < p_1 \le \infty$.

(a) Consider $L^{p_0} \cap L^{p_1}$ equipped with

$$\|f\|_{L^{p_0} \cap L^{p_1}} = \|f\|_{L^{p_0}} + \|f\|_{L^{p_1}}.$$

Show that $\|\cdot\|_{L^{p_0} \cap L^{p_1}}$ is a norm, and that $L^{p_0} \cap L^{p_1}$ (with this norm) is a
Banach space.

(b) Suppose $L^{p_0} + L^{p_1}$ is defined as the vector space of measurable functions $f$
on $X$ that can be written as a sum $f = f_0 + f_1$ with $f_0 \in L^{p_0}$ and $f_1 \in L^{p_1}$.
Consider

$$\|f\|_{L^{p_0} + L^{p_1}} = \inf\{\|f_0\|_{L^{p_0}} + \|f_1\|_{L^{p_1}}\},$$

where the infimum is taken over all decompositions $f = f_0 + f_1$ with $f_0 \in L^{p_0}$
and $f_1 \in L^{p_1}$. Show that $\|\cdot\|_{L^{p_0} + L^{p_1}}$ is a norm, and that $L^{p_0} + L^{p_1}$
(with this norm) is a Banach space.

(c) Show that $L^p \subset L^{p_0} + L^{p_1}$ if $p_0 \le p \le p_1$.


10. A measure space $(X, \mu)$ is separable if there is a countable family of
measurable subsets $\{E_k\}_{k=1}^\infty$ so that if $E$ is any measurable set of finite measure, then

$$\mu(E \triangle E_{n_k}) \to 0 \quad \text{as } k \to \infty$$

for an appropriate subsequence $\{n_k\}$ which depends on $E$. Here $A \triangle B$ denotes the
symmetric difference of the sets $A$ and $B$, that is,

$$A \triangle B = (A - B) \cup (B - A).$$

(a) Verify that $\mathbb{R}^d$ with the usual Lebesgue measure is separable.

(b) The space $L^p(X)$ is separable if there exists a countable collection of
elements $\{f_n\}_{n=1}^\infty$ in $L^p$ that is dense. Prove that if the measure space $X$ is
separable, then $L^p$ is separable when $1 \le p < \infty$.

11. In light of the previous exercise, prove the following:

(a) Show that the space $L^\infty(\mathbb{R})$ is not separable by constructing for each $a \in \mathbb{R}$
an $f_a \in L^\infty$, with $\|f_a - f_b\| \ge 1$, if $a \ne b$.

(b) Do the same for the dual space of $L^\infty(\mathbb{R})$.


12. Suppose the measure space $(X, \mu)$ is separable as defined in Exercise 10. Let
$1 \le p < \infty$ and $1/p + 1/q = 1$. A sequence $\{f_n\}$ with $f_n \in L^p$ is said to converge
to $f \in L^p$ weakly if

$$\int f_n g \, d\mu \to \int f g \, d\mu \quad \text{for every } g \in L^q. \tag{22}$$

(a) Verify that if $\|f - f_n\|_{L^p} \to 0$, then $f_n$ converges to $f$ weakly.

(b) Suppose $\sup_n \|f_n\|_{L^p} < \infty$. Then, to verify weak convergence it suffices to
check (22) for a dense subset of functions $g$ in $L^q$.

(c) Suppose $1 < p < \infty$. Show that if $\sup_n \|f_n\|_{L^p} < \infty$, then there exists $f \in L^p$,
and a subsequence $\{n_k\}$ so that $f_{n_k}$ converges weakly to $f$.

Part (c) is known as the “weak compactness” of $L^p$ for $1 < p < \infty$, which fails
when $p = 1$ as is seen in the exercise below.

[Hint: For (b) use Exercise 10 (b).]


13. Below are some examples illustrating weak convergence.

(a) $f_n(x) = \sin(2\pi n x)$ in $L^p([0,1])$. Show that $f_n \to 0$ weakly.

(b) $f_n(x) = n^{1/p} \chi(nx)$ in $L^p(\mathbb{R})$. Then $f_n \to 0$ weakly if $p > 1$, but not when
$p = 1$. Here $\chi$ denotes the characteristic function of $[0,1]$.

(c) $f_n(x) = 1 + \sin(2\pi n x)$ in $L^1([0,1])$. Then $f_n \to 1$ weakly also in $L^1([0,1])$,
$\|f_n\|_{L^1} = 1$, but $\|f_n - 1\|_{L^1}$ does not converge to zero. Compare with
Problem 6 part (d).
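A numerical sketch can make the contrast between weak and norm convergence in parts (a) and (c) visible. It is an illustration only, not a proof: the helper names `integrate`, `pairing` and `l1_norm_of_sine` are hypothetical, the integrals are approximated by a midpoint rule, and the test function $g(x) = x$ is just one choice, for which $\int_0^1 x \sin(2\pi n x)\,dx = -1/(2\pi n)$.

```python
import math


def integrate(h, M=100000):
    """Midpoint-rule approximation of the integral of h over [0, 1]."""
    return sum(h((k + 0.5) / M) for k in range(M)) / M


def pairing(n, g):
    """Approximate the integral of sin(2*pi*n*x) * g(x) over [0, 1]."""
    return integrate(lambda x: math.sin(2 * math.pi * n * x) * g(x))


def l1_norm_of_sine(n):
    """Approximate the L^1([0,1]) norm of sin(2*pi*n*x)."""
    return integrate(lambda x: abs(math.sin(2 * math.pi * n * x)))


if __name__ == "__main__":
    for n in (1, 10, 50):
        # The pairing shrinks like 1/n, while the L^1 norm stays near 2/pi.
        print(n, pairing(n, lambda x: x), l1_norm_of_sine(n))
```

The pairings against a fixed $g$ tend to zero, while $\|f_n - 1\|_{L^1} = \|\sin(2\pi n x)\|_{L^1}$ stays equal to $2/\pi$ for every $n \ge 1$.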

14. Suppose $X$ is a measure space, $1 < p < \infty$, and suppose $\{f_n\}$ is a sequence of
functions with $\|f_n\|_{L^p} \le M < \infty$.

(a) Prove that if $f_n \to f$ a.e. then $f_n \to f$ weakly.

(b) Show that the above result may fail if $p = 1$.

(c) Show that if $f_n \to f_1$ a.e. and $f_n \to f_2$ weakly, then $f_1 = f_2$ a.e.


15. Minkowski’s inequality for integrals. Suppose $(X_1, \mu_1)$ and $(X_2, \mu_2)$
are two measure spaces, and $1 \le p \le \infty$. Show that if $f(x_1, x_2)$ is measurable on
$X_1 \times X_2$ and non-negative, then

$$\left\| \int f(x_1, x_2) \, d\mu_2 \right\|_{L^p(X_1)} \le \int \|f(x_1, x_2)\|_{L^p(X_1)} \, d\mu_2.$$

Extend this statement to the case when $f$ is complex-valued and the right-hand
side of the inequality is finite.

[Hint: For $1 < p < \infty$, use a combination of Hölder’s inequality, and its converse
in Lemma 4.2.]

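16. Prove that if $f_j \in L^{p_j}(X)$, where $X$ is a measure space, $j = 1, \ldots, N$, and
$\sum_{j=1}^N 1/p_j = 1$ with $p_j \ge 1$, then

$$\left\| \prod_{j=1}^N f_j \right\|_{L^1} \le \prod_{j=1}^N \|f_j\|_{L^{p_j}}.$$

This is the multiple Hölder inequality.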


17. The convolution of $f$ and $g$ on $\mathbb{R}^d$ equipped with the Lebesgue measure is
defined by

$$(f * g)(x) = \int_{\mathbb{R}^d} f(x - y) g(y) \, dy.$$

(a) If $f \in L^p$, $1 \le p \le \infty$, and $g \in L^1$, then show that for almost every $x$ the
integrand $f(x - y) g(y)$ is integrable in $y$, hence $f * g$ is well defined.
Moreover, $f * g \in L^p$ with

$$\|f * g\|_{L^p} \le \|f\|_{L^p} \|g\|_{L^1}.$$

(b) A version of (a) applies when $g$ is replaced by a finite Borel measure $\mu$: if
$f \in L^p$, with $1 \le p \le \infty$, define

$$(f * \mu)(x) = \int_{\mathbb{R}^d} f(x - y) \, d\mu(y),$$

and show that $\|f * \mu\|_{L^p} \le \|f\|_{L^p} |\mu|(\mathbb{R}^d)$.

(c) Prove that if $f \in L^p$ and $g \in L^q$, where $p$ and $q$ are conjugate exponents, then
$f * g \in L^\infty$ with $\|f * g\|_{L^\infty} \le \|f\|_{L^p} \|g\|_{L^q}$. Moreover, the convolution $f * g$
is uniformly continuous on $\mathbb{R}^d$, and if $1 < p < \infty$, then $\lim_{|x| \to \infty} (f * g)(x) = 0$.

[Hint: For (a) and (b) use the Minkowski inequality for integrals in Exercise 15.
For part (c), use Exercise 8.]


18. We consider the $L^p$ spaces with mixed norm, in a special case that is useful
in several contexts.

We take as our underlying space the product space $\{(x, t)\} = \mathbb{R}^d \times \mathbb{R}$, with the
product measure $dx \, dt$, where $dx$ and $dt$ are Lebesgue measures on $\mathbb{R}^d$ and $\mathbb{R}$
respectively. We define $L^r_t(L^p_x) = L^{p,r}$, with $1 \le p \le \infty$, $1 \le r \le \infty$, to be the
space of equivalence classes of jointly measurable functions $f(x, t)$ for which the
norm

$$\|f\|_{L^{p,r}} = \left( \int_{\mathbb{R}} \left( \int_{\mathbb{R}^d} |f(x, t)|^p \, dx \right)^{r/p} dt \right)^{1/r}$$

is finite (when $p < \infty$ and $r < \infty$), and an obvious variant when $p = \infty$ or $r = \infty$.

(a) Verify that $L^{p,r}$ with this norm is complete, and hence is a Banach space.

(b) Prove the general form of Hölder’s inequality in this context

$$\int_{\mathbb{R}^d \times \mathbb{R}} |f(x, t) g(x, t)| \, dx \, dt \le \|f\|_{L^{p,r}} \|g\|_{L^{p',r'}},$$

with $1/p + 1/p' = 1$ and $1/r + 1/r' = 1$.

(c) Show that if $f$ is integrable over all sets of finite measure, then

$$\|f\|_{L^{p,r}} = \sup \left| \int_{\mathbb{R}^d \times \mathbb{R}} f(x, t) g(x, t) \, dx \, dt \right|,$$

with the sup taken over all $g$ that are simple and $\|g\|_{L^{p',r'}} \le 1$.

(d) Conclude that the dual space of $L^{p,r}$ is $L^{p',r'}$, if $1 \le p < \infty$, and $1 \le r < \infty$.


19. Young’s inequality. Suppose $1 \le p, r \le \infty$. Prove the following on $\mathbb{R}^d$:

$$\|f * g\|_{L^q} \le \|f\|_{L^p} \|g\|_{L^r} \quad \text{whenever } 1/q = 1/p + 1/r - 1.$$

Here, $f * g$ denotes the convolution of $f$ and $g$ as defined in Exercise 17.
[Hint: Assume $f, g \ge 0$, and use the decomposition

$$f(y) g(x - y) = f(y)^a g(x - y)^b \left[ f(y)^{1-a} g(x - y)^{1-b} \right]$$

for appropriate $a$ and $b$, together with Exercise 16 to find that

$$\left| \int f(y) g(x - y) \, dy \right| \le \|f\|_{L^p}^{1 - p/q} \|g\|_{L^r}^{1 - r/q} \left( \int |f(y)|^p |g(x - y)|^r \, dy \right)^{1/q}.]$$


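20. Suppose $X$ is a measure space, $0 < p_0 < p < p_1 \le \infty$, and $f \in L^{p_0}(X) \cap L^{p_1}(X)$.
Then $f \in L^p(X)$ and

$$\|f\|_{L^p} \le \|f\|_{L^{p_0}}^{1-t} \|f\|_{L^{p_1}}^{t}, \quad \text{if } t \text{ is chosen so that } \frac{1}{p} = \frac{1-t}{p_0} + \frac{t}{p_1}.$$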
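The interpolation bound $\|f\|_{L^p} \le \|f\|_{L^{p_0}}^{1-t} \|f\|_{L^{p_1}}^{t}$, with $t$ determined by $1/p = (1-t)/p_0 + t/p_1$, can be tried out on the simplest measure space, a finite set with counting measure. The sketch below makes that assumption, restricts to finite $p_1$, and uses the hypothetical names `lp_norm` and `interpolation_gap`; it reports the right-hand side minus the left-hand side, which should never be negative.

```python
def lp_norm(values, p):
    """L^p norm with respect to counting measure on a finite set (finite p only)."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def interpolation_gap(values, p0, p, p1):
    """Right-hand side minus left-hand side of the interpolation inequality."""
    # t solves 1/p = (1 - t)/p0 + t/p1, assuming p0 < p < p1 < infinity
    t = (1.0 / p0 - 1.0 / p) / (1.0 / p0 - 1.0 / p1)
    bound = lp_norm(values, p0) ** (1 - t) * lp_norm(values, p1) ** t
    return bound - lp_norm(values, p)


if __name__ == "__main__":
    print(interpolation_gap([3.0, -1.0, 0.5, 2.0], 1.0, 2.0, 4.0))
    # For a function of constant modulus the gap is zero up to rounding.
    print(interpolation_gap([1.0, 1.0, 1.0, 1.0], 1.0, 2.0, 4.0))
```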

21. Recall the definition of a convex function. (See Problem 4, Chapter 3, in
Book III.) Suppose $\varphi$ is a non-negative convex function on $\mathbb{R}$ and $f$ is real-valued
and integrable on a measure space $X$, with $\mu(X) = 1$. Then we have Jensen’s
inequality:

$$\varphi\left( \int_X f \, d\mu \right) \le \int_X \varphi(f) \, d\mu.$$

Note that if $\varphi(t) = |t|^p$, $1 \le p$, then $\varphi$ is convex and the above can be obtained
from Hölder’s inequality. Another interesting case is $\varphi(t) = e^{at}$.

[Hint: Since $\varphi$ is convex, one has, $\varphi(\sum_{j=1}^N a_j x_j) \le \sum_{j=1}^N a_j \varphi(x_j)$, whenever $a_j, x_j$
are real, $a_j \ge 0$, and $\sum_{j=1}^N a_j = 1$.]


22. Another inequality of Young. Suppose $\varphi$ and $\psi$ are both continuous,
strictly increasing functions on $[0, \infty)$ that are inverses of each other, that is,
$(\varphi \circ \psi)(x) = x$ for all $x \ge 0$. Let

$$\Phi(x) = \int_0^x \varphi(u) \, du \quad \text{and} \quad \Psi(x) = \int_0^x \psi(u) \, du.$$

(a) Prove: $ab \le \Phi(a) + \Psi(b)$ for all $a, b \ge 0$.

In particular, if $\varphi(x) = x^{p-1}$ and $\psi(y) = y^{q-1}$ with $1 < p < \infty$ and
$1/p + 1/q = 1$, then we get $\Phi(x) = x^p/p$, $\Psi(y) = y^q/q$, and

$$A^\theta B^{1-\theta} \le \theta A + (1 - \theta) B \quad \text{for all } A, B \ge 0 \text{ and } 0 \le \theta \le 1.$$

(b) Prove that we have equality in Young’s inequality only if $b = \varphi(a)$ (that is,
$a = \psi(b)$).

[Hint: Consider the area $ab$ of the rectangle whose vertices are $(0,0)$, $(a,0)$, $(0,b)$
and $(a,b)$, and compare it to areas “under” the curves $y = \varphi(x)$ and $x = \psi(y)$.]
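For the power case $\varphi(x) = x^{p-1}$, the inequality in part (a) reads $ab \le a^p/p + b^q/q$, and part (b) says equality forces $b = a^{p-1}$. A quick numerical sketch of both statements follows; the function name `young_gap` is hypothetical and the sample values are arbitrary.

```python
def young_gap(a, b, p):
    """a^p/p + b^q/q - a*b for conjugate exponents 1/p + 1/q = 1, with p > 1."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b


if __name__ == "__main__":
    print(young_gap(2.0, 1.0, 3.0))   # strictly positive: b differs from a^(p-1)
    print(young_gap(2.0, 4.0, 3.0))   # b = a^(p-1) = 4, so the gap vanishes
```

Geometrically the gap is the area of the region between the rectangle and the curve $y = \varphi(x)$ that the two integrals cover in excess of $ab$.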


23. Let $(X, \mu)$ be a measure space and suppose $\Phi(t)$ is a continuous, convex, and
increasing function on $[0, \infty)$, with $\Phi(0) = 0$. Define

$$L^\Phi = \left\{ f \text{ measurable} : \int_X \Phi(|f(x)|/M) \, d\mu < \infty \text{ for some } M > 0 \right\},$$

and

$$\|f\|_{L^\Phi} = \inf\left\{ M > 0 : \int_X \Phi(|f(x)|/M) \, d\mu \le 1 \right\}.$$

Prove that:

(a) $L^\Phi$ is a vector space.

(b) $\|\cdot\|_{L^\Phi}$ is a norm.

(c) $L^\Phi$ is complete in this norm.

The Banach spaces $L^\Phi$ are called Orlicz spaces. Note that in the special case
$\Phi(t) = t^p$, $1 \le p < \infty$, then $L^\Phi = L^p$.

[Hint: Observe that if $f \in L^\Phi$, then $\lim_{N \to \infty} \int_X \Phi(|f|/N) \, d\mu = 0$. Also, use the
fact that there exists $A > 0$ so that $\Phi(t) \ge At$ for all $t \ge 0$.]


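24. Let $1 \le p_0 < p_1 < \infty$.

(a) Consider the Banach space $L^{p_0} \cap L^{p_1}$ with norm $\|f\|_{L^{p_0} \cap L^{p_1}} = \|f\|_{L^{p_0}} + \|f\|_{L^{p_1}}$.
(See Exercise 9.) Let

$$\Phi(t) = \begin{cases} t^{p_0} & \text{if } 0 \le t \le 1, \\ t^{p_1} & \text{if } 1 \le t < \infty. \end{cases}$$

Show that $L^\Phi$ with its norm is equivalent to the space $L^{p_0} \cap L^{p_1}$. In other
words, there exist $A, B > 0$, so that

$$A \|f\|_{L^{p_0} \cap L^{p_1}} \le \|f\|_{L^\Phi} \le B \|f\|_{L^{p_0} \cap L^{p_1}}.$$

(b) Similarly, consider the Banach space $L^{p_0} + L^{p_1}$ with its norm as defined in
Exercise 9. Let

$$\Psi(t) = \int_0^t \psi(u) \, du \quad \text{where} \quad \psi(u) = \begin{cases} u^{p_1 - 1} & \text{if } 0 \le u \le 1, \\ u^{p_0 - 1} & \text{if } 1 \le u < \infty. \end{cases}$$

Show that $L^\Psi$ with its norm is equivalent to the space $L^{p_0} + L^{p_1}$.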


25. Show that a Banach space $\mathcal{B}$ is a Hilbert space if and only if the parallelogram
law holds

$$\|f + g\|^2 + \|f - g\|^2 = 2(\|f\|^2 + \|g\|^2).$$

As a consequence, prove that if $L^p(\mathbb{R}^d)$ with the Lebesgue measure is a Hilbert
space, then necessarily $p = 2$.

[Hint: For the first part, in the real case, let $(f, g) = \frac{1}{4}(\|f + g\|^2 - \|f - g\|^2)$.]
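The second part of this exercise can be explored with a two-point model: on a space consisting of two atoms of measure one, the characteristic functions $f = (1, 0)$ and $g = (0, 1)$ give $\|f + g\|^2 + \|f - g\|^2 = 2 \cdot 2^{2/p}$, while $2(\|f\|^2 + \|g\|^2) = 4$. The sketch below computes this defect; the names `lp_norm` and `parallelogram_defect` are hypothetical, and the two-point space is an assumption standing in for two disjoint sets of measure one in $\mathbb{R}^d$.

```python
def lp_norm(v, p):
    """l^p norm of a finite vector (finite p only)."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def parallelogram_defect(p):
    """Left side minus right side of the parallelogram law for f = (1,0), g = (0,1)."""
    f, g = (1.0, 0.0), (0.0, 1.0)
    s = tuple(a + b for a, b in zip(f, g))   # f + g
    d = tuple(a - b for a, b in zip(f, g))   # f - g
    lhs = lp_norm(s, p) ** 2 + lp_norm(d, p) ** 2
    rhs = 2 * (lp_norm(f, p) ** 2 + lp_norm(g, p) ** 2)
    return lhs - rhs


if __name__ == "__main__":
    for p in (1.0, 1.5, 2.0, 3.0, 4.0):
        print(p, parallelogram_defect(p))
```

The defect equals $2 \cdot 2^{2/p} - 4$, which is positive for $p < 2$, negative for $p > 2$, and vanishes only at $p = 2$.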

26. Suppose $1 < p_0, p_1 < \infty$ and $1/p_0 + 1/q_0 = 1$ and $1/p_1 + 1/q_1 = 1$. Show that
the Banach spaces $L^{p_0} \cap L^{p_1}$ and $L^{q_0} + L^{q_1}$ are duals of each other up to an
equivalence of norms. (See Exercise 9 for the relevant definitions of these spaces.
Also, Problem 5* gives a generalization of this result.)

27. The purpose of this exercise is to prove that the unit ball in $L^p$ is strictly
convex when $1 < p < \infty$, in the following sense. Here $L^p$ is the space of
real-valued functions whose $p$-th power are integrable. Suppose $\|f_0\|_{L^p} = \|f_1\|_{L^p} = 1$,
and let

$$f_t = (1 - t) f_0 + t f_1$$

be the straight-line segment joining the points $f_0$ and $f_1$. Then $\|f_t\|_{L^p} < 1$ for all
$t$ with $0 < t < 1$, unless $f_0 = f_1$.

(a) Let $f \in L^p$ and $g \in L^q$, $1/p + 1/q = 1$, with $\|f\|_{L^p} = 1$ and $\|g\|_{L^q} = 1$. Then

$$\int f g \, d\mu = 1$$

only when $f(x) = \operatorname{sign} g(x) \, |g(x)|^{q-1}$.

(b) Suppose $\|f_{t'}\|_{L^p} = 1$ for some $0 < t' < 1$. Find $g \in L^q$, $\|g\|_{L^q} = 1$, so that

$$\int f_{t'} g \, d\mu = 1$$

and let $F(t) = \int f_t g \, d\mu$. Observe as a result that $F(t) = 1$ for all $0 \le t \le 1$.
Conclude that $f_t = f_0$ for all $0 \le t \le 1$.

(c) Show that the strict convexity fails when $p = 1$ or $p = \infty$. What can be said
about these cases?

A stronger assertion is given in Problem 6*.

[Hint: To prove (a) show that the case of equality in $A^\theta B^{1-\theta} \le \theta A + (1 - \theta) B$, for
$A, B > 0$ and $0 < \theta < 1$ holds only when $A = B$.]

28. Verify the completeness of $\Lambda^\alpha(\mathbb{R}^d)$ and $L^p_k(\mathbb{R}^d)$.

29. Consider further the spaces $\Lambda^\alpha(\mathbb{R}^d)$.

(a) Show that when $\alpha > 1$ the only functions in $\Lambda^\alpha(\mathbb{R}^d)$ are the constants.

(b) Motivated by (a), one defines $C^{k,\alpha}(\mathbb{R}^d)$ to be the class of functions $f$ on $\mathbb{R}^d$
whose partial derivatives of order less than or equal to $k$ belong to $\Lambda^\alpha(\mathbb{R}^d)$.
Here $k$ is an integer and $0 < \alpha \le 1$. Show that this space, endowed with the
norm

$$\|f\|_{C^{k,\alpha}} = \sum_{|\beta| \le k} \|\partial_x^\beta f\|_{\Lambda^\alpha(\mathbb{R}^d)},$$

is a Banach space.


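30. Suppose $\mathcal{B}$ is a Banach space and $\mathcal{S}$ is a closed linear subspace of $\mathcal{B}$. The
subspace $\mathcal{S}$ defines an equivalence relation $f \sim g$ to mean $f - g \in \mathcal{S}$. If $\mathcal{B}/\mathcal{S}$ denotes
the collection of these equivalence classes, then show that $\mathcal{B}/\mathcal{S}$ is a Banach space
with norm $\|f\|_{\mathcal{B}/\mathcal{S}} = \inf(\|f'\|_{\mathcal{B}}, \ f' \sim f)$.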

31. If $\Omega$ is an open subset of $\mathbb{R}^d$ then one definition of $L^p_k(\Omega)$ can be taken to be the
quotient Banach space $\mathcal{B}/\mathcal{S}$, as defined in the previous exercise, with $\mathcal{B} = L^p_k(\mathbb{R}^d)$
and $\mathcal{S}$ the subspace of those functions which vanish a.e. on $\Omega$. Another possible
space, that we will denote by $L^p_k(\Omega^0)$, consists of the closure in $L^p_k(\mathbb{R}^d)$ of all $f$
that have compact support in $\Omega$. Observe that the natural mapping of $L^p_k(\Omega^0)$ to
$L^p_k(\Omega)$ has norm equal to 1. However, this mapping is in general not surjective;
prove this in the case when $\Omega$ is the unit ball and $k \ge 1$.


32. A Banach space is said to be separable if it contains a countable dense subset.
In Exercise 11 we saw an example of a Banach space $\mathcal{B}$ that is separable, but where
$\mathcal{B}^*$ is not separable. Prove, however, that in general when $\mathcal{B}^*$ is separable, then $\mathcal{B}$
is separable. Note that this gives another proof that in general $L^1$ is not the dual
of $L^\infty$.

33. Let $V$ be a vector space over the complex numbers $\mathbb{C}$, and suppose there exists
a real-valued function $p$ on $V$ satisfying:

$$\begin{cases} p(\alpha v) = |\alpha| p(v), & \text{if } \alpha \in \mathbb{C}, \text{ and } v \in V, \\ p(v_1 + v_2) \le p(v_1) + p(v_2), & \text{if } v_1 \text{ and } v_2 \in V. \end{cases}$$

Prove that if $V_0$ is a subspace of $V$ and $\ell_0$ a linear functional on $V_0$ which satisfies
$|\ell_0(f)| \le p(f)$ for all $f \in V_0$, then $\ell_0$ can be extended to a linear functional $\ell$ on $V$
that satisfies $|\ell(f)| \le p(f)$ for all $f \in V$.

[Hint: If $u = \operatorname{Re}(\ell_0)$, then $\ell_0(v) = u(v) - iu(iv)$. Apply Theorem 5.2 to $u$.]

34. Suppose $\mathcal{B}$ is a Banach space and $\mathcal{S}$ a closed proper subspace, and assume
$f_0 \notin \mathcal{S}$. Show that there is a continuous linear functional $\ell$ on $\mathcal{B}$, so that $\ell(f) = 0$
for $f \in \mathcal{S}$, and $\ell(f_0) = 1$. The linear functional $\ell$ can be chosen so that $\|\ell\| = 1/d$
where $d$ is the distance from $f_0$ to $\mathcal{S}$.

35. A linear functional $\ell$ on a Banach space $\mathcal{B}$ is continuous if and only if
$\{f \in \mathcal{B} : \ell(f) = 0\}$ is closed.

[Hint: This is a consequence of Exercise 34.]

36. The results in Section 5.4 can be extended to $d$ dimensions.

(a) Show that there exists an extended-valued non-negative function $\hat{m}$ defined
on all subsets of $\mathbb{R}^d$ so that (i) $\hat{m}$ is finitely additive; (ii) $\hat{m}(E) = m(E)$
whenever $E$ is Lebesgue measurable, where $m$ is Lebesgue measure; and
(iii) $\hat{m}(E + h) = \hat{m}(E)$ for all sets $E$ and every $h \in \mathbb{R}^d$. Prove this as a
consequence of (b) below.

(b) Show that there is an “integral” $I$, defined on all bounded functions on
$\mathbb{R}^d/\mathbb{Z}^d$, so that $I(f) \ge 0$ whenever $f \ge 0$; the map $f \mapsto I(f)$ is linear;
$I(f) = \int_{\mathbb{R}^d/\mathbb{Z}^d} f \, dx$ whenever $f$ is measurable; and $I(f_h) = I(f)$ where
$f_h(x) = f(x - h)$, and $h \in \mathbb{R}^d$.


9 Problems 

1. The spaces $L^\infty$ and $L^1$ play universal roles with respect to all Banach spaces
in the following sense.

(a) If $\mathcal{B}$ is any separable Banach space, show that it can be realized without
change of norm as a linear subspace of $L^\infty(\mathbb{Z})$. Precisely, prove that there
is a linear operator $i$ of $\mathcal{B}$ into $L^\infty(\mathbb{Z})$ so that $\|i(f)\|_{L^\infty(\mathbb{Z})} = \|f\|_{\mathcal{B}}$ for all
$f \in \mathcal{B}$.

(b) Each such $\mathcal{B}$ can also be realized as a quotient space of $L^1(\mathbb{Z})$. That is, there
is a linear surjection $P$ of $L^1(\mathbb{Z})$ onto $\mathcal{B}$, so that if $\mathcal{S} = \{x \in L^1(\mathbb{Z}) : P(x) = 0\}$,
then $\|P(x)\|_{\mathcal{B}} = \inf_{y \in \mathcal{S}} \|x + y\|_{L^1(\mathbb{Z})}$, for each $x \in L^1(\mathbb{Z})$. This gives an
identification of $\mathcal{B}$ (and its norm) with the quotient space $L^1(\mathbb{Z})/\mathcal{S}$ (and its
norm), as defined in Exercise 30.

Note that similar conclusions hold for $L^\infty(X)$ and $L^1(X)$ if $X$ is a measure space
that contains a countable disjoint collection of measurable sets of positive and
finite measure.

[Hint: For (a), let $\{f_n\}$ be a dense set of non-zero vectors in $\mathcal{B}$, and let $\ell_n \in \mathcal{B}^*$
be such that $\|\ell_n\|_{\mathcal{B}^*} = 1$ and $\ell_n(f_n) = \|f_n\|$. If $f \in \mathcal{B}$, set $i(f) = \{\ell_n(f)\}_{n=-\infty}^{\infty}$.
For (b), if $x = \{x_n\} \in L^1(\mathbb{Z})$, with $\sum_{-\infty}^{\infty} |x_n| = \|x\|_{L^1(\mathbb{Z})} < \infty$, define $P$ by
$P(x) = \sum_{-\infty}^{\infty} x_n f_n / \|f_n\|$.]

2. There is a “generalized limit” $L$ defined on the vector space $V$ of all real
sequences $\{s_n\}_{n=1}^\infty$ that are bounded, so that:

(i) $L$ is a linear functional on $V$.

(ii) $L(\{s_n\}) \ge 0$ if $s_n \ge 0$, for all $n$.

(iii) $L(\{s_n\}) = \lim_{n \to \infty} s_n$ if the sequence $\{s_n\}$ has a limit.

(iv) $L(\{s_n\}) = L(\{s_{n+k}\})$ for every $k \ge 1$.

(v) $L(\{s_n\}) = L(\{s'_n\})$ if $s_n - s'_n \ne 0$ for only finitely many $n$.

[Hint: Let $p(\{s_n\}) = \limsup_{n \to \infty} \left( \frac{s_1 + \cdots + s_n}{n} \right)$, and extend the linear functional $L$
defined by $L(\{s_n\}) = \lim_{n \to \infty} s_n$, defined on the subspace consisting of sequences
that have limits.]

3. Show that the closed unit ball in a Banach space $\mathcal{B}$ is compact (that is, if
$f_n \in \mathcal{B}$, $\|f_n\| \le 1$, then there is a subsequence that converges in the norm) if and
only if $\mathcal{B}$ is finite dimensional.

[Hint: If $\mathcal{S}$ is a closed subspace of $\mathcal{B}$, then there exists $x \in \mathcal{B}$ with $\|x\| = 1$ and the
distance between $x$ and $\mathcal{S}$ is greater than 1/2.]

4. Suppose $X$ is a $\sigma$-compact measurable metric space, and $C_b(X)$ is separable,
where $C_b(X)$ denotes the Banach space of bounded continuous functions on $X$
with the sup-norm.

(a) If $\{\mu_n\}_{n=1}^\infty$ is a bounded sequence in $M(X)$, then there exists a $\mu \in M(X)$
and a subsequence $\{\mu_{n_j}\}_{j=1}^\infty$ so that $\mu_{n_j}$ converges to $\mu$ in the following
(weak*) sense:

$$\int_X g(x) \, d\mu_{n_j}(x) \to \int_X g(x) \, d\mu(x), \quad \text{for all } g \in C_b(X).$$

(b) Start with a $\mu_0 \in M(X)$ that is positive, and for each $f \in L^1(\mu_0)$ consider
the mapping $f \mapsto f \, d\mu_0$. This mapping is an isometry of $L^1(\mu_0)$ to the
subspace of $M(X)$ consisting of signed measures which are absolutely
continuous with respect to $\mu_0$.

(c) Hence if $\{f_n\}$ is a bounded sequence of functions in $L^1(\mu_0)$, then there
exist a $\mu \in M(X)$ and a subsequence $\{f_{n_j}\}$ such that the measures $f_{n_j} \, d\mu_0$
converge to $\mu$ in the above sense.


5.* Let $X$ be a measure space. Suppose $\varphi$ and $\psi$ are both continuous, strictly
increasing functions on $[0, \infty)$ which are inverses of each other, that is, $(\varphi \circ \psi)(x) = x$
for all $x \ge 0$. Let

$$\Phi(x) = \int_0^x \varphi(u) \, du \quad \text{and} \quad \Psi(x) = \int_0^x \psi(u) \, du.$$

Consider the Orlicz spaces $L^\Phi(X)$ and $L^\Psi(X)$ introduced in Exercise 23.

(a) In connection with Exercise 22 the following Hölder-like inequality holds:

$$\int |fg| \le C \|f\|_{L^\Phi} \|g\|_{L^\Psi} \quad \text{for some } C > 0, \text{ and all } f \in L^\Phi \text{ and } g \in L^\Psi.$$

(b) Suppose there exists $c > 0$ so that $\Phi(2t) \le c\Phi(t)$ for all $t \ge 0$. Then the dual
of $L^\Phi$ is equivalent to $L^\Psi$.

6.* There are generalizations of the parallelogram law for $L^2$ (see Exercise 25) that
hold for $L^p$. These are the Clarkson inequalities:

(a) For $2 \le p < \infty$ the statement is that

$$\left\| \frac{f + g}{2} \right\|_{L^p}^p + \left\| \frac{f - g}{2} \right\|_{L^p}^p \le \frac{1}{2} \left( \|f\|_{L^p}^p + \|g\|_{L^p}^p \right).$$

(b) For $1 < p \le 2$ the statement is that

$$\left\| \frac{f + g}{2} \right\|_{L^p}^q + \left\| \frac{f - g}{2} \right\|_{L^p}^q \le \left( \frac{1}{2} \left( \|f\|_{L^p}^p + \|g\|_{L^p}^p \right) \right)^{q/p},$$

where $1/p + 1/q = 1$.

(c) As a result, $L^p$ is uniformly convex when $1 < p < \infty$. This means that
there is a function $\delta = \delta_p(\epsilon)$, with $0 < \delta < 1$, (and $\delta(\epsilon) \to 0$ as $\epsilon \to 0$),
so that whenever $\|f\|_{L^p} = \|g\|_{L^p} = 1$, then $\|f - g\|_{L^p} \ge \epsilon$ implies that
$\left\| \frac{f + g}{2} \right\|_{L^p} \le 1 - \delta$.

This is stronger than the conclusion of strict convexity in Exercise 27.

(d) Using the result in (c), prove the following: suppose $1 < p < \infty$, and the
sequence $\{f_n\}$, $f_n \in L^p$, converges weakly to $f$. If $\|f_n\|_{L^p} \to \|f\|_{L^p}$, then
$f_n$ converges to $f$ strongly, that is, $\|f_n - f\|_{L^p} \to 0$ as $n \to \infty$.

7.* An important notion is that of the equivalence of Banach spaces. Suppose
$\mathcal{B}_1$ and $\mathcal{B}_2$ are a pair of Banach spaces. We say that $\mathcal{B}_1$ and $\mathcal{B}_2$ are equivalent
(also said to be “isomorphic”) if there is a linear bijection $T$ between $\mathcal{B}_1$ and $\mathcal{B}_2$
that is bounded and whose inverse is also bounded. Note that any pair of
finite-dimensional Banach spaces are equivalent if and only if their dimensions are the
same.

Suppose now we consider $L^p(X)$ for a general class of $X$ (which contains for
instance, $X = \mathbb{R}^d$ with Lebesgue measure). Then:

(a) $L^p$ and $L^q$ are equivalent if and only if $p = q$.

(b) However, for any $p$ with $1 < p < \infty$, $L^2$ is equivalent with a closed
infinite-dimensional subspace of $L^p$.


8.* There is no finitely-additive rotationally-invariant measure extending Lebesgue
measure to all subsets of the sphere $S^d$ when $d \ge 2$, in distinction to what happens
on the torus $\mathbb{R}^d/\mathbb{Z}^d$ when $d \ge 2$. (See Exercise 36.) This is due to a remarkable
construction of Hausdorff that uses the fact that the corresponding rotation group
of $S^d$ is non-commutative. In fact, one can decompose $S^2$ into four disjoint sets
$A$, $B$, $C$ and $Z$ so that (i) $Z$ is denumerable, (ii) $A \sim B \sim C$, but $A \sim (B \cup C)$.
Here the notation $A_1 \sim A_2$ means that $A_1$ can be transformed into $A_2$ via a
rotation.

9.* As a consequence of the previous problem one can show that it is not possible to
extend Lebesgue measure on $\mathbb{R}^d$, $d \ge 3$, as a finitely-additive measure on all subsets
of $\mathbb{R}^d$ so that it is both translation and rotation invariant (that is, invariant under
Euclidean motions). This is graphically shown by the “Banach-Tarski paradox”:
There is a finite decomposition of the unit ball $B_1 = \bigcup_{j=1}^N E_j$, with the sets $E_j$
disjoint, and there are corresponding sets $\tilde{E}_j$ that are each obtained from $E_j$ by
a Euclidean motion, with the $\tilde{E}_j$ also disjoint, so that $\bigcup_{j=1}^N \tilde{E}_j = B_2$ the ball of
radius 2.



$L^p$ Spaces in Harmonic Analysis


The important part played in Hilbert’s treatment of 
Fredholm theory of integral equations by functions 
whose squares are summable is well-known, and it was 
inevitable that members of the Gottingen school of 
mathematics should be led to set themselves the task 
of proving the converse of ParsevaFs theorem.... On 
the other hand, efforts made to extend these isolated 
results to embrace cases in which the known or un¬ 
known index of summability is other than 2, appear 
to have failed... 

W. H. Young, 1912 


“… I have proved that two conjugate trigonometric series
are at the same time the Fourier series of $L^p$ functions,
$p > 1$. That is, if one is, so is the other. My
proof is unrelated to the theorem of Young-Hausdorff.”

M. Riesz, letter to G. H. Hardy, 1923


“Some months ago you wrote ‘… I have proved that
two conjugate ... $L^p$ functions, $p > 1$’. I want the
proof. Both I and my pupil Titchmarsh have tried
in vain to prove it...”

G. H. Hardy, letter to M. Riesz, 1923


The fact that $L^p$ spaces were bound to play a significant role in harmonic
analysis was understood not long after their introduction. Viewed
from that early perspective, these spaces stood at the nexus between
Fourier series and complex analysis, this connection having been given
by the Cauchy integral and the related conjugate function. For this reason
methods of complex function theory predominated in the beginning
stages of the subject, but they had to give way to “real” methods so as
to allow the extension of much of the theory to higher dimensions.

It is the aim of this chapter to show the reader something about both
of these methods. In fact, the real-variable ideas that will be introduced
here will also be further exploited in the next chapter, when studying
singular integral operators in $\mathbb{R}^d$.

The present chapter is organized as follows. We begin with an initial
view of the role of $L^p$ in the context of Fourier series, together with a
related convexity theorem for operators acting on these spaces. Then we
pass to M. Riesz’s proof of the $L^p$ boundedness of the Hilbert transform,
an iconic example of the use of complex analysis in this setting.

From this we turn to the real-variable methods, starting with the maximal
function and its attendant “weak-type” estimate. The importance
of the weak-type space is that it provides a useful substitute for $L^1$ when,
as in many instances, $L^1$ estimates fail. We also study another significant
substitute for $L^1$, the “real” Hardy space $H^1_r$. It has the advantage that
it is a Banach space and that its dual space (a substitute for $L^\infty$) is the
space of functions of bounded mean oscillation. This last function space
is itself of wide interest in analysis.

1 Early Motivations 

An initial problem considered was that of formulating an $L^p$ analog of
the basic $L^2$ Parseval relation for functions on $[0, 2\pi]$. This theorem
states that if $a_n = \frac{1}{2\pi}\int_0^{2\pi} f(\theta) e^{-in\theta}\, d\theta$ denotes the Fourier coefficients of
a function $f$ in $L^2([0, 2\pi])$, usually written as

(1)  $f(\theta) \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta},$

then the following fundamental identity holds:

(2)  $\sum_{n=-\infty}^{\infty} |a_n|^2 = \frac{1}{2\pi}\int_0^{2\pi} |f(\theta)|^2\, d\theta.$

Conversely, if $\{a_n\}$ is a sequence for which the left-hand side of (2) is
finite, then there exists a unique $f$ in $L^2([0, 2\pi])$ so that both (1) and (2)
hold. Notice, in particular, if $f \in L^2([0, 2\pi])$, then its Fourier coefficients
$\{a_n\}$ belong to $L^2(\mathbb{Z}) = \ell^2(\mathbb{Z})$.$^1$ The question that arose was: is there an
analog of this result for $L^p$ when $p \neq 2$?

Here an important dichotomy between the case $p > 2$ and $p < 2$ occurs.
In the first case, when $f \in L^p([0, 2\pi])$, although $f$ is automatically in
$L^2([0, 2\pi])$, examples show that no better conclusion than $\sum |a_n|^2 < \infty$
is possible. On the other hand, when $p < 2$ one can see that essentially
there can be no better conclusion than $\sum |a_n|^q < \infty$, with $q$ the dual
exponent of $p$. Analogous restrictions must be envisaged when the roles
of $f$ and $\{a_n\}$ are reversed.

$^1$ See for instance Section 3 in Chapter 4 of Book III.

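In fact, what does hold is the Hausdorff-Young inequality:

(3)  $\left( \sum_{n=-\infty}^{\infty} |a_n|^q \right)^{1/q} \le \left( \frac{1}{2\pi}\int_0^{2\pi} |f(\theta)|^p\, d\theta \right)^{1/p},$

and its “dual”

(4)  $\left( \frac{1}{2\pi}\int_0^{2\pi} |f(\theta)|^q\, d\theta \right)^{1/q} \le \left( \sum_{n=-\infty}^{\infty} |a_n|^p \right)^{1/p},$

both valid when $1 \le p \le 2$ and $1/p + 1/q = 1$. (The case $q = \infty$ corresponds
to the usual $L^\infty$ norm.) These may be viewed as intermediate
results, between the case $p = 2$ corresponding to Parseval’s theorem, and
the “trivial” case $p = 1$ and $q = \infty$.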


A few words about how the inequalities (3) and (4) were first attacked
are in order, because they contain a useful insight about $L^p$ spaces: often,
the simplest case arises when $p$ (or its dual) is an even integer. Indeed,
when, for example $q = 4$, a function belonging to $L^4$ is the same as its
square belonging to $L^2$, and this sometimes allows reduction to the easier
situation when $p = 2$. To see how this works in the present situation, let
us take $q = 4$ (and $p = 4/3$) in (3). With $f$ given in $L^p$, we denote by $\mathcal{F}$
the convolution of $f$ with itself,

$\mathcal{F}(\theta) = \frac{1}{2\pi}\int_0^{2\pi} f(\theta - \varphi) f(\varphi)\, d\varphi.$

By the multiplicative property of Fourier coefficients of convolutions we
have

$\mathcal{F}(\theta) \sim \sum_{n=-\infty}^{\infty} a_n^2 e^{in\theta},$

with $\{a_n\}$ the Fourier coefficients of $f$. Parseval’s identity applied to $\mathcal{F}$
then yields

$\sum_{n=-\infty}^{\infty} |a_n|^4 = \frac{1}{2\pi}\int_0^{2\pi} |\mathcal{F}(\theta)|^2\, d\theta,$

and Young’s inequality for convolutions (the periodic analog of Exercise 19,
Chapter 1) gives

$\|\mathcal{F}\|_{L^2} \le \|f\|_{L^{4/3}}^2,$

proving (3) when $p = 4/3$ and $q = 4$.

Once the case $q = 4$ has been established, the cases corresponding to
$q = 2k$, where $k$ is a positive integer, can be handled in a similar way.
However the general situation, $2 \le q \le \infty$, corresponding to $1 \le p \le 2$,
involves further ideas.

In contrast to the above ingenious but special argument, it turns out
that there is a general principle of great interest that underlies such
inequalities, which in fact leads to direct and abstract proofs of both (3)
and (4). This is the M. Riesz interpolation theorem. Stated succinctly,
it asserts that whenever a linear operator satisfies a pair of inequalities
(like (3) for $p = 2$ and $p = 1$), then automatically the operator satisfies
the corresponding inequalities for the intermediate exponents: here all $p$
for $1 \le p \le 2$, and $q$ with $1/p + 1/q = 1$. The formulation and proof of
this general theorem will be our first task in the next section.

Before we turn to that, we will describe briefly another initial source for
the role of $L^p$ in harmonic analysis, one which highlights its connection
with complex analysis.

Together with the Fourier series (1) for $f$ in $L^2$, one considers its
“conjugate function” or “allied series”, defined by

(5)  $\tilde{f}(\theta) \sim \sum_{n=-\infty}^{\infty} \frac{\mathrm{sign}(n)}{i}\, a_n e^{in\theta},$

where $\mathrm{sign}(n) = 1$ if $n > 0$, $\mathrm{sign}(n) = -1$ if $n < 0$, and $\mathrm{sign}(n) = 0$ when
$n = 0$.$^2$

The significance of this definition is that

$\frac{1}{2}\left( f(\theta) + i\tilde{f}(\theta) + a_0 \right) \sim F(e^{i\theta}),$

where $F(z) = \sum_{n=0}^{\infty} a_n z^n$ is the analytic function in the unit disc $|z| < 1$
given as the Cauchy integral (projection) of $f$, namely:

$F(z) = \frac{1}{2\pi i}\int_0^{2\pi} \frac{f(\theta)}{e^{i\theta} - z}\, i e^{i\theta}\, d\theta.$

Figure 1. The Cauchy integral $F(z)$ is defined for $|z| < 1$, while $f(\theta)$ is
defined for $z = e^{i\theta}$.

Moreover, if $f$ is real-valued (that is $a_n = \overline{a_{-n}}$), so is $\tilde{f}$ and thus $f + a_0$
and $\tilde{f}$ represent respectively the real and imaginary parts of the boundary
values of the analytic function $2F$ in the unit disc.

The key $L^2$ identity linking $f$ and $\tilde{f}$ is a simple consequence of Parseval’s
relation:

(6)  $\frac{1}{2\pi}\int_0^{2\pi} |\tilde{f}(\theta)|^2\, d\theta + |a_0|^2 = \frac{1}{2\pi}\int_0^{2\pi} |f(\theta)|^2\, d\theta.$

An early goal of the subject was the extension of this theory to $L^p$, and
it was also achieved by M. Riesz.

As he tells it, he was led to the discovery of his result when preparing to 
administer a “licenciat” exam to a rather mediocre student. One of the 
problems on the exam was to prove (6). To quote Riesz: “… However it 
was quite obvious that my candidate did not know Parseval’s theorem. 
Before giving him the problem, I had therefore to think if there was 
another way for him to arrive at the required conclusion. I immediately 
realized that it was Cauchy’s theorem that was at the source of the result, 
and this observation led me quite directly to the solution of the general 
problem, a question that had longtime occupied me.” 

What Riesz had in mind was the following argument. If we assume for
simplicity that $a_0 = 0$, then under the (technical) assumption that the
analytic function $F$ is actually continuous in the closure of the unit disc,
one has by the mean-value theorem (as a simple consequence of Cauchy’s
theorem) applied to the analytic function $F^2$, the identity

(7)  $\frac{1}{2\pi}\int_0^{2\pi} \left( F(e^{i\theta}) \right)^2 d\theta = 0.$

If we suppose, as above, that $f$ is real-valued, then by considering the
real part of $4(F(e^{i\theta}))^2$, which is $(f(\theta))^2 - (\tilde{f}(\theta))^2$, we immediately
get (6). What became clear to Riesz is that when we replace $F^2$ by $F^{2k}$
in the above, with $k$ a positive integer, and again consider its real part,
the boundedness of $f \mapsto \tilde{f}$ in $L^p$, where $p = 2k$, follows. Similar but more
involved arguments worked for all $p$, $1 < p < \infty$.

$^2$ Incidentally the conjugate function is the “symmetry-breaking” operator relevant to
the divergence of Fourier series considered in Book I.

Here, once again, the Riesz interpolation theorem can play a crucial 
role. We will present these ideas below in the context where the unit disc 
is replaced by the upper half-plane. 

2 The Riesz interpolation theorem 

Suppose $(p_0, q_0)$ and $(p_1, q_1)$ are two pairs of indices with $1 \le p_j, q_j \le \infty$,
and assume that

$\|T(f)\|_{L^{q_0}} \le M_0 \|f\|_{L^{p_0}}$  and  $\|T(f)\|_{L^{q_1}} \le M_1 \|f\|_{L^{p_1}},$

where $T$ is a linear operator. Does it follow that

$\|T(f)\|_{L^q} \le M \|f\|_{L^p}$  for other pairs $(p, q)$?

We shall see that this inequality will hold with values of $p$ and $q$ determined
by a linear expression involving the reciprocals of the indices
$p_0$, $p_1$, $q_0$ and $q_1$. (Linearity in the reciprocals of the exponents already
arises in the relation $1/p + 1/p' = 1$ of dual exponents.)

The precise statement of the theorem requires that we fix some notation.
Let $(X, \mu)$ and $(Y, \nu)$ be a pair of measure spaces. We shall
abbreviate the $L^p$ norm on $(X, \mu)$ by writing $\|f\|_{L^p} = \|f\|_{L^p(X, \mu)}$, and
similarly for the $L^q$ norm for functions on $(Y, \nu)$. We will also consider
the space $L^{p_0} + L^{p_1}$ that consists of functions on $(X, \mu)$ that can
be written as $f_0 + f_1$, with $f_j \in L^{p_j}(X, \mu)$, with a similar definition for
$L^{q_0} + L^{q_1}$.

Theorem 2.1 Suppose $T$ is a linear mapping from $L^{p_0} + L^{p_1}$ to $L^{q_0} + L^{q_1}$.
Assume that $T$ is bounded from $L^{p_0}$ to $L^{q_0}$ and from $L^{p_1}$ to $L^{q_1}$:

$\|T(f)\|_{L^{q_0}} \le M_0 \|f\|_{L^{p_0}},$
$\|T(f)\|_{L^{q_1}} \le M_1 \|f\|_{L^{p_1}}.$

Then $T$ is bounded from $L^p$ to $L^q$,

$\|T(f)\|_{L^q} \le M \|f\|_{L^p},$

whenever the pair $(p, q)$ can be written as

$\frac{1}{p} = \frac{1-t}{p_0} + \frac{t}{p_1}$  and  $\frac{1}{q} = \frac{1-t}{q_0} + \frac{t}{q_1}$

for some $t$ with $0 \le t \le 1$. Moreover, the bound $M$ satisfies $M \le M_0^{1-t} M_1^{t}$.
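The exponent bookkeeping in the theorem can be sketched in a few lines of Python. This is only an illustrative helper, not part of the text: the function names `interpolate_exponents` and `interpolated_bound` are made up here, and an infinite exponent is represented by `None` (an assumption of this sketch), so that its reciprocal is 0.

```python
from fractions import Fraction


def interpolate_exponents(p0, q0, p1, q1, t):
    """Return (1/p, 1/q) from 1/p = (1-t)/p0 + t/p1 and 1/q = (1-t)/q0 + t/q1.

    Exponents are finite numbers >= 1, or None for an infinite exponent.
    Reciprocals are returned as exact fractions.
    """
    def recip(e):
        return Fraction(0) if e is None else 1 / Fraction(e)

    t = Fraction(t)
    inv_p = (1 - t) * recip(p0) + t * recip(p1)
    inv_q = (1 - t) * recip(q0) + t * recip(q1)
    return inv_p, inv_q


def interpolated_bound(m0, m1, t):
    """The bound M_0^(1-t) * M_1^t from the theorem."""
    return m0 ** (1 - t) * m1 ** t


# Endpoints (p0, q0) = (2, 2) and (p1, q1) = (1, infinity), halfway between them.
inv_p, inv_q = interpolate_exponents(2, 2, 1, None, Fraction(1, 2))
```

With these endpoints and $t = 1/2$ one gets $1/p = 3/4$ and $1/q = 1/4$, that is $p = 4/3$ and $q = 4$, the pair treated by hand in Section 1.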

We should emphasize that the theorem holds for $L^p$ spaces of complex-valued
functions because the proof of it depends on complex analysis.
Starting with the strip $0 < \mathrm{Re}(z) < 1$ in the complex plane, our operator
$T$ will lead us to an analytic function $\Phi$ so that the hypotheses
$\|T(f)\|_{L^{q_0}} \le M_0 \|f\|_{L^{p_0}}$ and $\|T(f)\|_{L^{q_1}} \le M_1 \|f\|_{L^{p_1}}$ are encoded in the
boundedness of $\Phi$ on the boundary lines $\mathrm{Re}(z) = 0$ and $\mathrm{Re}(z) = 1$, respectively.
Moreover, the conclusion will follow from the boundedness of
$\Phi$ at the point $t$ on the real axis. (See Figure 2.)


Figure 2. The domain of the function $\Phi$ (the strip between the lines $\mathrm{Re}(z) = 0$ and $\mathrm{Re}(z) = 1$, with the point $t$ marked on the real axis)


The analysis of the function $\Phi$ will depend on the following lemma.

Lemma 2.2 (Three-lines lemma) Suppose $\Phi(z)$ is a holomorphic function
in the strip $S = \{z \in \mathbb{C} : 0 < \mathrm{Re}(z) < 1\}$, that is also continuous
and bounded on the closure of $S$. If

$M_0 = \sup_{y \in \mathbb{R}} |\Phi(iy)|$  and  $M_1 = \sup_{y \in \mathbb{R}} |\Phi(1 + iy)|,$

then

$\sup_{y \in \mathbb{R}} |\Phi(t + iy)| \le M_0^{1-t} M_1^{t},$  for all $0 \le t \le 1$.

The term “three-lines” describes the fact that the size of $\Phi$ on the line
$\mathrm{Re}(z) = t$ is controlled by its size on the two boundary lines $\mathrm{Re}(z) = 0$
and $\mathrm{Re}(z) = 1$. The reader may note that this lemma belongs to the
family of results of the Phragmén-Lindelöf type that were discussed in
Chapter 4, Book II. As with other assertions of this kind, it is deduced
from the more familiar maximum modulus principle, and it is here that
the global assumption that $\Phi$ is bounded throughout the strip is used.
Notice, however, that the size of the assumed global bound of $\Phi$ does
not occur in the conclusion. (That some condition on the growth of $\Phi$ is
necessary is shown in Exercise 5.)
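As a sanity check of the lemma, one can test it numerically on a concrete function. The sketch below uses $\Phi(z) = 1/(1+z)$, a choice made here for illustration (it is holomorphic in the strip and bounded and continuous on its closure); the suprema over the vertical lines are only approximated on a finite grid, and the helper name `sup_on_line` is hypothetical.

```python
def sup_on_line(phi, x, y_max=50.0, steps=20001):
    """Approximate sup over y of |phi(x + iy)| on a uniform grid of [-y_max, y_max]."""
    best = 0.0
    for k in range(steps):
        y = -y_max + 2.0 * y_max * k / (steps - 1)
        best = max(best, abs(phi(complex(x, y))))
    return best


def phi(z):
    # Illustrative choice: bounded and holomorphic for Re(z) > -1.
    return 1.0 / (1.0 + z)


m0 = sup_on_line(phi, 0.0)  # size on the line Re(z) = 0
m1 = sup_on_line(phi, 1.0)  # size on the line Re(z) = 1

checks = []
for t in (0.1, 0.25, 0.5, 0.75, 0.9):
    lhs = sup_on_line(phi, t)            # size on the line Re(z) = t
    rhs = m0 ** (1 - t) * m1 ** t        # bound from the three-lines lemma
    checks.append((t, lhs, rhs, lhs <= rhs + 1e-12))
```

For this function the supremum on $\mathrm{Re}(z) = t$ is attained at $y = 0$ and equals $1/(1+t)$, while the lemma's bound is $2^{-t}$; the inequality $1/(1+t) \le 2^{-t}$ is strict for $0 < t < 1$ and becomes an equality at the two endpoints.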

Proof. We begin by proving the lemma under the assumption that
$M_0 = M_1 = 1$ and $\sup_{0 \le x \le 1} |\Phi(x + iy)| \to 0$ as $|y| \to \infty$. In this case,
let $M = \sup |\Phi(z)|$ where the sup is taken over all $z$ in the closure of
the strip $S$. We may clearly assume that $M > 0$, and let $z_1, z_2, \ldots$ be
a sequence of points in the strip with $|\Phi(z_n)| \to M$ as $n \to \infty$. By the
decay condition imposed on $\Phi$, the points $z_n$ cannot go to infinity, hence
there exists $z_0$ in the closure of the strip, so that a subsequence of $\{z_n\}$
converges to $z_0$. By the maximum modulus principle, $z_0$ cannot be in the
interior of the strip (unless $\Phi$ is constant, in which case the conclusion is
trivial), hence $z_0$ must be on its boundary, where $|\Phi| \le 1$. Thus $M \le 1$,
and the result is proved in this special case.

If we only assume now that $M_0 = M_1 = 1$, we define

$\Phi_\epsilon(z) = \Phi(z) e^{\epsilon(z^2 - 1)},$  for each $\epsilon > 0$.

Since $e^{\epsilon[(x+iy)^2 - 1]} = e^{\epsilon(x^2 - 1 - y^2 + 2ixy)}$, we find that $|\Phi_\epsilon(z)| \le 1$ on the
lines $\mathrm{Re}(z) = 0$ and $\mathrm{Re}(z) = 1$. Moreover,

$\sup_{0 \le x \le 1} |\Phi_\epsilon(x + iy)| \to 0$  as $|y| \to \infty$,

since $\Phi$ is bounded. Therefore, by the first case, we know that $|\Phi_\epsilon(z)| \le 1$
in the closure of the strip. Letting $\epsilon \to 0$, we see that $|\Phi| \le 1$ as desired.

Finally, for arbitrary positive values of $M_0$ and $M_1$, we let $\tilde{\Phi}(z) =
M_0^{z-1} M_1^{-z} \Phi(z)$, and note that $\tilde{\Phi}$ satisfies the condition of the previous
case, that is, $\tilde{\Phi}$ is bounded by 1 on the lines $\mathrm{Re}(z) = 0$ and $\mathrm{Re}(z) = 1$.
Thus $|\tilde{\Phi}| \le 1$ in the strip, that is, $|\Phi(t + iy)| \le M_0^{1-t} M_1^{t}$, which completes
the proof of the lemma.

To prove the interpolation theorem, we begin by establishing the inequality
when $f$ is a simple function, and it clearly suffices to do so with
$\|f\|_{L^p} = 1$. Also, we recall that to show $\|Tf\|_{L^q} \le M \|f\|_{L^p}$ it suffices to
prove, by Lemma 4.2 in Chapter 1, that

$\left| \int (Tf)\, g\, d\nu \right| \le M \|f\|_{L^p} \|g\|_{L^{q'}},$

where $1/q + 1/q' = 1$, and $g$ simple with $\|g\|_{L^{q'}} = 1$.

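For now, we also assume that $p < \infty$ and $q > 1$. Suppose $f \in L^p$ is
simple with $\|f\|_{L^p} = 1$, and define

$f_z = |f|^{\gamma(z)} \frac{f}{|f|}$  where  $\gamma(z) = p\left( \frac{1-z}{p_0} + \frac{z}{p_1} \right),$

and

$g_z = |g|^{\delta(z)} \frac{g}{|g|}$  where  $\delta(z) = q'\left( \frac{1-z}{q_0'} + \frac{z}{q_1'} \right),$

with $q'$, $q_0'$ and $q_1'$ denoting the duals of $q$, $q_0$, and $q_1$ respectively. Then,
we note that $f_t = f$, while

$\|f_z\|_{L^{p_0}} = 1$  if $\mathrm{Re}(z) = 0$,
$\|f_z\|_{L^{p_1}} = 1$  if $\mathrm{Re}(z) = 1$.

Similarly $\|g_z\|_{L^{q_0'}} = 1$ if $\mathrm{Re}(z) = 0$ and $\|g_z\|_{L^{q_1'}} = 1$ if $\mathrm{Re}(z) = 1$, and also
$g_t = g$. The trick now is to consider

$\Phi(z) = \int (T f_z)\, g_z\, d\nu.$

Since $f$ is a finite sum, $f = \sum_k a_k \chi_{E_k}$ where the sets $E_k$ are disjoint and
of finite measure, then $f_z$ is also simple with

$f_z = \sum_k |a_k|^{\gamma(z)} \frac{a_k}{|a_k|} \chi_{E_k}.$

Since $g = \sum_j b_j \chi_{F_j}$ is also simple, then

$g_z = \sum_j |b_j|^{\delta(z)} \frac{b_j}{|b_j|} \chi_{F_j}.$

With the above notation, we find

$\Phi(z) = \sum_{j,k} |a_k|^{\gamma(z)} |b_j|^{\delta(z)} \frac{a_k}{|a_k|} \frac{b_j}{|b_j|} \left( \int T(\chi_{E_k}) \chi_{F_j}\, d\nu \right),$

so that the function $\Phi$ is a holomorphic function in the strip $0 < \mathrm{Re}(z) < 1$
that is bounded and continuous in its closure. After an application of
Hölder’s inequality and using the fact that $T$ is bounded from $L^{p_0}$ to $L^{q_0}$ with
bound $M_0$, we find that if $\mathrm{Re}(z) = 0$, then

$|\Phi(z)| \le \|T f_z\|_{L^{q_0}} \|g_z\|_{L^{q_0'}} \le M_0 \|f_z\|_{L^{p_0}} = M_0.$

Similarly we find $|\Phi(z)| \le M_1$ on the line $\mathrm{Re}(z) = 1$. Therefore, by the
three-lines lemma, we conclude that $\Phi$ is bounded by $M_0^{1-t} M_1^{t}$ on the
line $\mathrm{Re}(z) = t$. Since $\Phi(t) = \int (Tf)\, g\, d\nu$, this gives the desired result, at
least when $f$ is simple.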

In general, when $f \in L^p$ with $1 \le p < \infty$, we may choose a sequence
$\{f_n\}$ of simple functions in $L^p$ so that $\|f_n - f\|_{L^p} \to 0$ (as in Exercise 6,
Chapter 1). Since $\|T(f_n)\|_{L^q} \le M \|f_n\|_{L^p}$, we find that $T(f_n)$ is a Cauchy
sequence in $L^q$ and if we can show that $\lim_{n \to \infty} T(f_n) = T(f)$ almost
everywhere, it would follow that we also have $\|T(f)\|_{L^q} \le M \|f\|_{L^p}$.

To do this, write $f = f^U + f^L$, where $f^U(x) = f(x)$ if $|f(x)| > 1$ and
0 elsewhere, while $f^L(x) = f(x)$ if $|f(x)| \le 1$ and 0 elsewhere. Similarly,
set $f_n = f_n^U + f_n^L$. Now assume that $p_0 \le p_1$ (the case $p_0 \ge p_1$ is
parallel). Then $p_0 \le p \le p_1$, and since $f \in L^p$, it follows that $f^U \in L^{p_0}$
and $f^L \in L^{p_1}$. Moreover, since $f_n \to f$ in the $L^p$ norm, then $f_n^U \to f^U$
in the $L^{p_0}$ norm and $f_n^L \to f^L$ in the $L^{p_1}$ norm. By hypothesis, then
$T(f_n^U) \to T(f^U)$ in $L^{q_0}$ and $T(f_n^L) \to T(f^L)$ in $L^{q_1}$, and selecting appropriate
subsequences we see that $T(f_n) = T(f_n^U) + T(f_n^L)$ converges to
$T(f)$ almost everywhere, which establishes the claim.

It remains to consider the cases $q = 1$ and $p = \infty$. In the latter case
then necessarily $p_0 = p_1 = \infty$, and the hypotheses $\|T(f)\|_{L^{q_0}} \le M_0 \|f\|_{L^\infty}$
and $\|T(f)\|_{L^{q_1}} \le M_1 \|f\|_{L^\infty}$ imply the conclusion

$\|T(f)\|_{L^q} \le M_0^{1-t} M_1^{t} \|f\|_{L^\infty}$

by Hölder’s inequality (as in Exercise 20 in Chapter 1).

Finally if $p < \infty$ and $q = 1$, then $q_0 = q_1 = 1$, and we may take $g_z = g$
for all $z$, and argue as in the case when $q > 1$. This completes the proof
of the theorem.

We shall now describe a slightly different but useful way of stating the
essence of the theorem. Here we assume that our linear operator $T$ is
initially defined on simple functions of $X$, mapping these to functions
on $Y$ that are integrable on sets of finite measure. We then ask: for
which $(p, q)$ is the operator of type $(p, q)$, in the sense that there is a
bound $M$ so that

(8)  $\|T(f)\|_{L^q} \le M \|f\|_{L^p},$  whenever $f$ is simple?

In this formulation of the question, the useful role of simple functions is
that they are at once common to all the $L^p$ spaces. Moreover, if (8) holds
then $T$ has a unique extension to all of $L^p$, with the same bound $M$ in (8),
as long as either $p < \infty$; or $p = \infty$ in the case $X$ has finite measure. This
is a consequence of the density of the simple functions in $L^p$, and the
extension argument in Proposition 5.4 of Chapter 1.

With these remarks in mind, we define the Riesz diagram of $T$ to
consist of all points in the unit square $\{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1\}$
that arise when we set $x = 1/p$ and $y = 1/q$ whenever $T$ is of type $(p, q)$.
We then also define $M_{x,y}$ as the least $M$ for which (8) holds when $x = 1/p$
and $y = 1/q$.


Corollary 2.3 With $T$ as before:

(a) The Riesz diagram of $T$ is a convex set.

(b) $\log M_{x,y}$ is a convex function on this set.

Conclusion (a) means that if $(x_0, y_0) = (1/p_0, 1/q_0)$ and $(x_1, y_1) =
(1/p_1, 1/q_1)$ are points in the Riesz diagram of $T$, then so is the line segment
joining them. This is an immediate consequence of Theorem 2.1.
Similarly the convexity of the function $\log M_{x,y}$ is its convexity on each
line segment, and this follows from the conclusion $M \le M_0^{1-t} M_1^{t}$ guaranteed
also by Theorem 2.1.

In view of this corollary, the theorem is often referred to as the “Riesz
convexity theorem.”


2.1 Some examples 


Example 1. The first application of Theorem 2.1 is the Hausdorff-Young
inequality (3). Here $X$ is $[0, 2\pi]$ with the normalized Lebesgue measure
$d\theta/(2\pi)$, and $Y = \mathbb{Z}$ with its usual counting measure. The mapping $T$ is
defined by $T(f) = \{a_n\}$, with

$a_n = \frac{1}{2\pi}\int_0^{2\pi} f(\theta) e^{-in\theta}\, d\theta.$

Corollary 2.4 If $1 \le p \le 2$ and $1/p + 1/q = 1$, then

$\|T(f)\|_{L^q(\mathbb{Z})} \le \|f\|_{L^p([0, 2\pi])}.$

Note that since $L^2([0, 2\pi]) \subset L^1([0, 2\pi])$ and $L^2(\mathbb{Z}) \subset L^\infty(\mathbb{Z})$ we have
$L^2([0, 2\pi]) + L^1([0, 2\pi]) = L^1([0, 2\pi])$, and also $L^2(\mathbb{Z}) + L^\infty(\mathbb{Z}) = L^\infty(\mathbb{Z})$.
The inequality for $p_0 = q_0 = 2$ is a consequence of Parseval’s identity,
while the one for $p_1 = 1$, $q_1 = \infty$ follows from the observation that for
all $n$,

$|a_n| \le \frac{1}{2\pi}\int_0^{2\pi} |f(\theta)|\, d\theta.$

Thus Riesz’s theorem guarantees the conclusion when $1/p = \frac{1-t}{2} + t$ and
$1/q = \frac{1-t}{2}$ for any $t$ with $0 \le t \le 1$. This gives all $p$ with $1 \le p \le 2$,
and $q$ related to $p$ by $1/p + 1/q = 1$.
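A numerical illustration of this corollary is easiest in a finite model. The sketch below replaces the circle by $N$ equally spaced sample points with mass $1/N$ each, and $\mathbb{Z}$ by the $N$ frequencies $0, \ldots, N-1$ with counting measure. This finite cyclic setting is an assumption of the example, not the setting of the text; the same two endpoint estimates (Parseval, and $|a_n| \le \|f\|_{L^1}$) hold there, so the interpolation theorem gives the same inequality with constant 1. All function names are made up for the illustration.

```python
import cmath
import math


def fourier_coefficients(samples):
    """a_n = (1/N) * sum_k f(theta_k) * exp(-i n theta_k), theta_k = 2 pi k / N."""
    n_pts = len(samples)
    return [
        sum(samples[k] * cmath.exp(-2j * math.pi * n * k / n_pts)
            for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


def lp_norm_normalized(samples, p):
    """L^p norm for the normalized counting measure (mass 1/N per point)."""
    return (sum(abs(v) ** p for v in samples) / len(samples)) ** (1.0 / p)


def lq_norm_counting(coeffs, q):
    """l^q norm for counting measure; q may be math.inf."""
    if math.isinf(q):
        return max(abs(c) for c in coeffs)
    return sum(abs(c) ** q for c in coeffs) ** (1.0 / q)


N = 64
f = [1.0 if k < N // 4 else 0.0 for k in range(N)]  # indicator of a quarter of the points
a = fourier_coefficients(f)

results = []
for p in (1.0, 1.25, 4.0 / 3.0, 1.5, 2.0):
    q = p / (p - 1.0) if p > 1.0 else math.inf  # dual exponent
    results.append((p, lq_norm_counting(a, q), lp_norm_normalized(f, p)))
```

Each entry of `results` holds $(p, \|a\|_{\ell^q}, \|f\|_{L^p})$. For this test function the inequality is an equality, up to rounding, at the two endpoints $p = 1$ and $p = 2$, and it is strict in between.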

Example 2. We next come to the dual Hausdorff-Young inequality (4).
Here we define the operator $T'$ mapping functions on $\mathbb{Z}$ to functions on
$[0, 2\pi]$ by

$T'(\{a_n\}) = \sum_{n=-\infty}^{\infty} a_n e^{in\theta}.$

Notice that since $L^p(\mathbb{Z}) \subset L^2(\mathbb{Z})$ when $p \le 2$, then the above is a well-defined
function in $L^2([0, 2\pi])$ when $\{a_n\} \in L^p(\mathbb{Z})$, by the unitary character
of Parseval’s identity.

Corollary 2.5 If $1 \le p \le 2$ and $1/p + 1/q = 1$, then

$\|T'(\{a_n\})\|_{L^q([0, 2\pi])} \le \|\{a_n\}\|_{L^p(\mathbb{Z})}.$

The proof is parallel to that of the previous corollary. The case $p_0 =
q_0 = 2$ is, as has already been mentioned, a consequence of Parseval’s
identity, while the case $p_1 = 1$ and $q_1 = \infty$ follows directly from the fact
that

$\left| \sum_{n=-\infty}^{\infty} a_n e^{in\theta} \right| \le \sum_{n=-\infty}^{\infty} |a_n|.$

An alternative proof of this corollary uses Corollary 2.4 as well as Theorem 4.1
and Theorem 5.5 in the previous chapter.

Example 3. We consider the analog for the Fourier transform. Here
the setting is $\mathbb{R}^d$ and the $L^p$ spaces are taken with respect to the usual
Lebesgue measure. We initially define the Fourier transform (denoted
here by $T$) on simple functions by

$T(f)(\xi) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\, dx.$

Then clearly, $\|T(f)\|_{L^\infty} \le \|f\|_{L^1}$, and $T$ has an extension (by Proposition 5.4
in Chapter 1 for instance) to $L^1(\mathbb{R}^d)$ for which this inequality
continues to hold. Also, $T$ has an extension to $L^2(\mathbb{R}^d)$ as a unitary
mapping. (This is essentially the content of Plancherel’s theorem. See
Section 1, Chapter 5 in Book III.) Thus in particular $\|T(f)\|_{L^2} \le \|f\|_{L^2}$,
for $f$ simple.

The same arguments as before then prove:

Corollary 2.6 If $1 \le p \le 2$ and $1/p + 1/q = 1$, then the Fourier transform
$T$ has a unique extension to a bounded map from $L^p$ to $L^q$, with

$\|T(f)\|_{L^q} \le \|f\|_{L^p}.$

We summarize these results by describing in Figure 3 the Riesz diagrams
for each of the above versions of the Hausdorff-Young theorem.
The three variants are as follows:

(i) The operator $T$ in Corollary 2.4: the closed triangle I.

(ii) The operator $T'$ in Corollary 2.5: the closed triangle II.

(iii) The operator $T$ in Corollary 2.6: the line segment joining $(1, 0)$ to
$(1/2, 1/2)$, that is, the common boundary of these two triangles.

Figure 3. Riesz diagrams for the Hausdorff-Young theorem (horizontal axis $1/p$, vertical axis $1/q$; the corners $(0,0)$, $(1,0)$ and $(1,1)$ are marked)

More precisely, the results above guarantee the inequality for the segment
joining $(1, 0)$ to $(1/2, 1/2)$ in each of the three cases. If we use the
trivial inequality $\|f\|_{L^1} \le \|f\|_{L^\infty}$ in Example 1 above, we get that the
point $(0, 0)$ also belongs to the Riesz diagram of $T$, yielding the closed
triangle I. Similarly, because $\|T'(\{a_n\})\|_{L^\infty} \le \|\{a_n\}\|_{L^1}$, we obtain the
triangle II for Example 2. Finally, we should note that in Example 3,
the Fourier transform, the Riesz diagram consists of no more than the
segment joining $(1, 0)$ to $(1/2, 1/2)$. (See Exercises 2 and 3.)

Example 4. Our last illustration is Young’s inequality for convolutions
in $\mathbb{R}^d$. It states that whenever $f$ and $g$ are a pair of functions in $L^p$ and
$L^r$ respectively, then the convolution

$(f * g)(x) = \int_{\mathbb{R}^d} f(x - y) g(y)\, dy$

is well-defined (that is, the function $f(x - y) g(y)$ is integrable for almost
every $x$), and moreover

(9)  $\|f * g\|_{L^q} \le \|f\|_{L^p} \|g\|_{L^r}$

under the assumption that $1/q = 1/p + 1/r - 1$ (with $1 \le q \le \infty$). One
proof of this has been outlined in Exercise 19 of the previous chapter.
Here we point out that it is also a consequence of the similar special cases
corresponding to $p = 1$, and $p$ the dual exponent of $r$. In fact it suffices to
prove (9) for simple functions $f$ and $g$, and then pass to the general case
by an easy limiting argument. With this in mind, fix $g$ and consider
the map $T$ defined by $T(f) = f * g$. We know (see Exercise 17 (a) in
Chapter 1, where the roles of $f$ and $g$ are interchanged) that $\|T(f)\|_{L^r} \le
M \|f\|_{L^1}$, with $M = \|g\|_{L^r}$. Also by Hölder’s inequality, $\|T(f)\|_{L^\infty} \le
M \|f\|_{L^{r'}}$, where $1/r' + 1/r = 1$. Now applying the Riesz interpolation
theorem gives the desired result.
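The exponent relation in (9) and the inequality itself can be tried out numerically. The sketch below works with finitely supported sequences on $\mathbb{Z}$ and counting measure, where Young's inequality holds in the same form; this discrete setting is a stand-in chosen for the example rather than the $\mathbb{R}^d$ setting of the text, and the helper names are hypothetical.

```python
def convolve(f, g):
    """Full convolution of two finitely supported sequences on Z."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def norm(seq, p):
    """l^p norm with respect to counting measure (finite p)."""
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)


def young_exponent(p, r):
    """Solve 1/q = 1/p + 1/r - 1 for q; this sketch assumes 1/p + 1/r > 1."""
    return 1.0 / (1.0 / p + 1.0 / r - 1.0)


f = [1.0, -2.0, 0.5, 3.0]
g = [0.25, 1.0, -1.5]
p, r = 4.0 / 3.0, 4.0 / 3.0
q = young_exponent(p, r)           # here q = 2

lhs = norm(convolve(f, g), q)      # ||f * g||_q
rhs = norm(f, p) * norm(g, r)      # ||f||_p ||g||_r
```

With $p = r = 4/3$ the relation gives $q = 2$, which is exactly the case used in Section 1 to bound the convolution of $f$ with itself.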

There is of course the parallel situation of the periodic case. For example,
in one dimension, taking the functions with period $2\pi$, the convolution
of $f$ and $g$ is defined by

$(f * g)(\theta) = \frac{1}{2\pi}\int_0^{2\pi} f(\theta - \varphi) g(\varphi)\, d\varphi.$

If we set $L^p = L^p([0, 2\pi])$ with the underlying measure $d\theta/(2\pi)$, then one
has again $\|f * g\|_{L^q} \le \|f\|_{L^p} \|g\|_{L^r}$, but automatically in a larger range
because $\|g\|_{L^{\tilde{r}}} \le \|g\|_{L^r}$, whenever $\tilde{r} \le r$.

The Riesz diagrams are described as follows (Figure 4):

The solid line segment joining $(1 - 1/r, 0)$ to $(1, 1/r)$ represents Young’s
inequality for $\mathbb{R}^d$. The closed (shaded) trapezoid represents the inequality
in the periodic case.

Figure 4. Riesz diagrams for $T(f) = f * g$, with $g \in L^r$


3 The $L^p$ theory of the Hilbert transform

We carry out the theory of the “conjugate function,” alluded to earlier
in Section 1, but we do it in the parallel framework where the unit circle
and the unit disc are replaced by $\mathbb{R}$ and the upper half-plane $\mathbb{R}^2_+ =
\{z = x + iy,\ x \in \mathbb{R},\ y > 0\}$, respectively. While the technical details of
the proofs are a little more involved in the latter context, the resulting
formulas are more elegant and their form leads more directly to important
generalizations in higher dimensions.


3.1 The $L^2$ formalism

We begin by setting down the basic formalism connecting the Hilbert
transform and the projection operator arising from the Cauchy integral.
Starting with an appropriate function $f$ on $\mathbb{R}$ we define its Cauchy integral
by

(10)  $F(z) = C(f)(z) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{f(t)}{t - z}\, dt,$  $\mathrm{Im}(z) > 0.$

For the moment we restrict ourselves to $f$ in $L^2(\mathbb{R})$. Then of course the
integral converges for all $z = x + iy$ with $y > 0$ (because $1/(t - z)$ is in
$L^2(\mathbb{R})$ as a function of $t$), and $F(z)$ is holomorphic in the upper half-plane.
We can also represent the Cauchy integral $F$ in terms of the $L^2$ Fourier
transform $\hat{f}$ of $f$ as$^3$

(11)  $F(z) = \int_0^{\infty} \hat{f}(\xi) e^{2\pi i z \xi}\, d\xi,$  $\mathrm{Im}(z) > 0.$

This integral converges because $e^{-2\pi y \xi}$ as a function of $\xi$ is in $L^2(0, \infty)$,
for $y > 0$. The above representation comes about because of the formula

(12)  $\int_0^{\infty} e^{2\pi i z \xi}\, d\xi = -\frac{1}{2\pi i z},$

which holds for $\mathrm{Im}(z) > 0$. (For more details about these assertions, and
their connection to the Hardy space $H^2$, see Section 2, Chapter 5 in
Book III.)

As is clear from (11) and Plancherel’s theorem, one has $F(x + iy) \to
P(f)(x)$, as $y \to 0$, in the $L^2(\mathbb{R})$ norm with

$P(f)(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) \chi(\xi) e^{2\pi i x \xi}\, d\xi$

and $\chi$ the characteristic function of $(0, \infty)$. Thus $P$ is the orthogonal
projection of $L^2(\mathbb{R})$ to the subspace of those $f$ for which $\hat{f}(\xi) = 0$ for
almost every $\xi < 0$. So as in (5) of Section 1, one is led to define the
Hilbert transform $H$ by

(13)  $H(f)(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) \frac{\mathrm{sign}(\xi)}{i} e^{2\pi i x \xi}\, d\xi.$

Some elementary facts, following directly from the definitions of $P$ and
$H$, are worth noting:

• $P = \frac{1}{2}(I + iH)$, where $I$ is the identity operator.

• $H$ is unitary on $L^2$, and $H \circ H = H^2 = -I$.

In other words, $\|H(f)\|_{L^2} = \|f\|_{L^2}$, and $H$ is invertible with $H^{-1} = -H$.

We now come to the important realization of the Hilbert transform as
a “singular integral.” It can be stated as follows.

Proposition 3.1 If $f \in L^2(\mathbb{R})$ then

(14)  $H(f)(x) = \lim_{\epsilon \to 0} \frac{1}{\pi}\int_{|t| \ge \epsilon} f(x - t) \frac{dt}{t}.$

That is, with $H_\epsilon(f)$ denoting the integral on the right-hand side above,
we have $H_\epsilon(f) \in L^2(\mathbb{R})$ for every $\epsilon > 0$, and the convergence asserted
in (14) is in the $L^2(\mathbb{R})$ norm.

$^3$ The Fourier transforms in the definitions below are taken in the $L^2$ sense, via
Plancherel’s theorem.
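First, we make a few observations. Note that with $z = x + iy$, then

(15)  $-\frac{1}{i\pi z} = \mathcal{P}_y(x) + i\mathcal{Q}_y(x)$

where

$\mathcal{P}_y(x) = \frac{y}{\pi(x^2 + y^2)}$  and  $\mathcal{Q}_y(x) = \frac{x}{\pi(x^2 + y^2)}$

are called the Poisson kernel and conjugate Poisson kernel, respectively.
Then because of (10), (11) and (15)

(16)  $\int_0^{\infty} \hat{f}(\xi) e^{2\pi i z \xi}\, d\xi = \frac{1}{2}\left[ (f * \mathcal{P}_y)(x) + i (f * \mathcal{Q}_y)(x) \right],$

where $(f * \mathcal{P}_y)(x) = \int f(x - t) \mathcal{P}_y(t)\, dt = \int f(t) \mathcal{P}_y(x - t)\, dt$, with similar
formulas for $f * \mathcal{Q}_y$.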
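The identity (15) is a one-line computation, since $-1/(i\pi z) = i\bar{z}/(\pi |z|^2) = (y + ix)/(\pi(x^2 + y^2))$. A minimal numerical check in Python follows, with function names chosen here for illustration only.

```python
import math


def poisson(x, y):
    """Poisson kernel y / (pi (x^2 + y^2))."""
    return y / (math.pi * (x * x + y * y))


def conjugate_poisson(x, y):
    """Conjugate Poisson kernel x / (pi (x^2 + y^2))."""
    return x / (math.pi * (x * x + y * y))


def cauchy_kernel(x, y):
    """The left-hand side of (15): -1 / (i pi z) with z = x + iy, y > 0."""
    z = complex(x, y)
    return -1.0 / (1j * math.pi * z)


max_error = 0.0
for x in (-3.0, -0.5, 0.0, 0.7, 2.0):
    for y in (0.1, 1.0, 4.0):
        expected = complex(poisson(x, y), conjugate_poisson(x, y))
        max_error = max(max_error, abs(cauchy_kernel(x, y) - expected))
```

The real part being even in $x$ and the imaginary part odd in $x$ is what drives the reflection argument that comes next.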

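Next define the reflection $\varphi \to \varphi^\sim$ by $\varphi^\sim(x) = \varphi(-x)$, and observe
that $(f * \mathcal{P}_y)^\sim = f^\sim * \mathcal{P}_y$, while $(f * \mathcal{Q}_y)^\sim = -(f^\sim * \mathcal{Q}_y)$, since $\mathcal{P}_y$ and
$\mathcal{Q}_y$ are respectively even and odd functions of $x$. Also $\widehat{(f^\sim)} = (\hat{f})^\sim$.
Therefore using (16) with $f$ and $f^\sim$ we then obtain

(17)  $(f * \mathcal{P}_y)(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i x \xi} e^{-2\pi y |\xi|}\, d\xi,$
      $(f * \mathcal{Q}_y)(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i x \xi} e^{-2\pi y |\xi|} \frac{\mathrm{sign}(\xi)}{i}\, d\xi.$

As a result, we obtain that the Fourier transforms of $\mathcal{P}_y$ and $\mathcal{Q}_y$ (taken
in $L^2$) are given by

(18)  $\hat{\mathcal{P}}_y(\xi) = e^{-2\pi y |\xi|},$  $\hat{\mathcal{Q}}_y(\xi) = e^{-2\pi y |\xi|} \frac{\mathrm{sign}(\xi)}{i}.$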

With this we turn to the proof of the proposition. We note, by (13),
(17), (18), and Plancherel’s theorem, that $f * \mathcal{Q}_\epsilon \to H(f)$ in the $L^2$ norm,
as $\epsilon \to 0$. Now consider

$\frac{1}{\pi}\int_{|t| \ge \epsilon} f(x - t) \frac{dt}{t} - (f * \mathcal{Q}_\epsilon)(x) = H_\epsilon(f)(x) - (f * \mathcal{Q}_\epsilon)(x).$

This difference equals $f * \Delta_\epsilon$, where

$\Delta_\epsilon(x) = \frac{1}{\pi x} - \mathcal{Q}_\epsilon(x)$  for $|x| \ge \epsilon$,
$\Delta_\epsilon(x) = -\mathcal{Q}_\epsilon(x)$  for $|x| < \epsilon$.

It is important to observe that $\Delta_\epsilon(x) = \epsilon^{-1} \Delta_1(\epsilon^{-1} x)$, while $|\Delta_1(x)| \le
A/(1 + x^2)$, since $1/x - x/(x^2 + 1) = O(1/x^3)$, if $|x| \ge 1$.$^4$ In particular
$\Delta_1$ is integrable over $\mathbb{R}$ and the family of kernels $\Delta_\epsilon(x)$ satisfies the
usual size conditions for an approximation to the identity,$^5$ but not the
condition $\int \Delta_\epsilon(x)\, dx = 1$. Instead $\int \Delta_\epsilon(x)\, dx = 0$, for all $\epsilon \neq 0$, because
$\Delta_\epsilon(x)$ is an odd function of $x$. As a consequence

(19)  $f * \Delta_\epsilon \to 0$  in the $L^2$ norm, as $\epsilon \to 0$,

and this gives that $H_\epsilon(f) \to H(f)$ in the $L^2$ norm, as $\epsilon \to 0$.
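The three properties of $\Delta_\epsilon$ used here (the scaling relation, the size bound, and oddness) can be checked on a grid. In the sketch below the constant $A = 1/\pi$ is a choice made for this example: for $|x| \ge 1$ one has $\Delta_1(x) = 1/(\pi x (x^2 + 1))$, and for $|x| < 1$ one has $|\Delta_1(x)| = |x|/(\pi(x^2 + 1))$, so in both ranges $|\Delta_1(x)|(1 + x^2) \le 1/\pi$. The function name `delta` is hypothetical.

```python
import math


def delta(x, eps=1.0):
    """Delta_eps(x): 1/(pi x) - Q_eps(x) for |x| >= eps, and -Q_eps(x) for |x| < eps."""
    q = x / (math.pi * (x * x + eps * eps))  # conjugate Poisson kernel Q_eps(x)
    return (1.0 / (math.pi * x) - q) if abs(x) >= eps else -q


A = 1.0 / math.pi  # constant chosen for this sketch
xs = [k / 100.0 for k in range(-2000, 2001)]

# Size condition |Delta_1(x)| <= A / (1 + x^2).
size_ok = all(abs(delta(x)) <= A / (1.0 + x * x) + 1e-12 for x in xs)

# Oddness, which forces the integral of Delta_eps to vanish.
odd_ok = all(abs(delta(x) + delta(-x)) < 1e-15 for x in xs)

# Scaling Delta_eps(x) = (1/eps) Delta_1(x/eps), tested with eps = 1/2.
scaling_ok = all(abs(delta(x, 0.5) - 2.0 * delta(2.0 * x)) < 1e-12 for x in xs)
```

A grid check of this kind is of course no proof, but it is a quick way to catch a sign or normalization slip in the kernels.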

We recall briefly how (19) can be proved. First

$(f * \Delta_\epsilon)(x) = \int f(x - t) \Delta_\epsilon(t)\, dt = \int (f(x - t) - f(x)) \Delta_\epsilon(t)\, dt$
$= \int (f(x - \epsilon t) - f(x)) \Delta_1(t)\, dt.$

Then by Minkowski’s inequality

$\|f * \Delta_\epsilon\|_{L^2} \le \int \|f(x - \epsilon t) - f(x)\|_{L^2} |\Delta_1(t)|\, dt.$

Now, the integral tends to zero with $\epsilon$ by the dominated convergence
theorem. This is because $\|f(x - \epsilon t) - f(x)\|_{L^2} \le 2\|f\|_{L^2}$, and $\|f(x -
\epsilon t) - f(x)\|_{L^2} \to 0$ as $\epsilon \to 0$ for each $t$. (For the continuity of the $L^2$
norm used here, see Exercise 8 in Chapter 1.)

Remark. The above argument shows also that $\|H_\epsilon(f)\|_{L^2} \le A \|f\|_{L^2}$
with $A$ independent of $\epsilon$ and $f$.

3.2 The $L^p$ theorem

With the elementary properties of the Hilbert transform established we
can now turn to our goal: the theorem of M. Riesz. It states that the
Hilbert transform is bounded on $L^p$, $1 < p < \infty$. One way to formulate
this is as follows.

Theorem 3.2 Suppose $1 < p < \infty$. Then the Hilbert transform $H$, initially
defined on $L^2 \cap L^p$ by (13) or (14), satisfies the inequality

(20)  $\|H(f)\|_{L^p} \le A_p \|f\|_{L^p},$  whenever $f \in L^2 \cap L^p$,

with a bound $A_p$ independent of $f$. The Hilbert transform then has a
unique extension to all of $L^p$ satisfying the same bound.$^6$

$^4$ We remind the reader of the notation $f(x) = O(g(x))$, which means that $|f(x)| \le
C|g(x)|$ for some constant $C$ and all $x$ in a given range.

$^5$ A discussion of approximations to the identity can be found, for instance, in Book III,
Section 2 and Exercise 2 of Chapter 3.


To have a better appreciation of the nature of this theorem it may
help to see why the conclusions fail for $p = 1$ or $p = \infty$. For this, an
explicit calculation does the job. Let $I$ denote the interval $(-1, 1)$, and
$f = \chi_I$ be the characteristic function of that interval. Now $f$ is an even
function, so its Hilbert transform is odd, and in fact a simple calculation
gives $H(f)(x) = \lim_{\epsilon \to 0} H_\epsilon(f)(x) = \frac{1}{\pi} \log\left| \frac{x+1}{x-1} \right|$. Hence $H(f)$ is
unbounded near $x = -1$ and $x = 1$, with mild (logarithmic) singularities
there. However $H(f)(x) \sim \frac{2}{\pi x}$ as $|x| \to \infty$, so it is obvious that $H(f)$
does not belong to $L^1$.
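The closed form above can be compared with a direct numerical evaluation of the truncated integral in (14). The sketch below is only a rough quadrature (midpoint rule on a uniform grid, with the window $|t| < \epsilon$ dropped); the function names, the step count and the choice of sample points are assumptions of this example.

```python
import math


def truncated_hilbert_indicator(x, eps, steps=200000):
    """Approximate (1/pi) * integral over |t| >= eps of f(x - t) dt / t, f = indicator of (-1, 1).

    f(x - t) = 1 exactly when x - 1 < t < x + 1, so we integrate 1/t over that
    interval by the midpoint rule, skipping midpoints with |t| < eps.
    """
    lo, hi = x - 1.0, x + 1.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        t = lo + (k + 0.5) * h
        if abs(t) >= eps:
            total += h / t
    return total / math.pi


def closed_form(x):
    """(1/pi) log |(x + 1) / (x - 1)|, valid for x different from -1 and 1."""
    return math.log(abs((x + 1.0) / (x - 1.0))) / math.pi


inside = truncated_hilbert_indicator(0.5, 1e-3)   # point inside the interval
outside = truncated_hilbert_indicator(2.0, 1e-3)  # point outside the interval
```

At $x = 0.5$ and $x = 2$ both the quadrature and the formula give $\frac{1}{\pi}\log 3$, since $|(x+1)/(x-1)| = 3$ at either point.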


It is also instructive to consider instead of $f = \chi_I$, the odd function
$g(x) = \chi_J(x) - \chi_J(-x)$, where $J = (0, 1)$. Then the Hilbert transform
of $g$ equals $H(g)(x) = \frac{1}{\pi} \log\left| \frac{x^2}{x^2 - 1} \right|$, and is an even function. While $H(g)$
is still unbounded (with mild logarithmic singularities at $-1$, $0$ and $1$),
it is integrable on $\mathbb{R}$, since $H(g)(x) \sim \frac{1}{\pi x^2}$ as $|x| \to \infty$.

Figure 5. Two examples of Hilbert transforms

$^6$ For the general extension principle used, see Proposition 5.4 in Chapter 1.

There is a nice lesson here whose significance will be clear at several
stages later on: namely, if $f$ is (say) a bounded function with compact
support on $\mathbb{R}$, then $H(f)$ is in $L^1(\mathbb{R})$ if and only if $\int f(x)\, dx = 0$. (See
Exercise 7.)

3.3 Proof of Theorem 3.2 

The main idea of the proof was already outlined at the end of Section 1 in
the context of Fourier series and the corresponding theorem for the
conjugate function. While this proof, which depends on complex analysis,
is elegant, its approach is essentially limited to this operator and cannot
deal with the generalizations of the Hilbert transform in the setting
of $\mathbb{R}^d$. The “real-variable” theory of those operators will be described in
Section 3 of the next chapter.

We turn to the proof of Theorem 3.2, and in preparation we invoke
two technical devices. The first is very simple and is the realization that
it suffices to prove the theorem for real-valued functions, from which
its extension to complex-valued functions is immediate (with a resulting
bound which is not more than twice the bound $A_p$ for real-valued
functions).

The second device depends on the use of the space $C_0^\infty(\mathbb{R})$ of
indefinitely differentiable functions of compact support. There are two useful
facts concerning this space. First, it is dense in $L^p(\mathbb{R})$, and more
particularly, if $f \in L^2 \cap L^p$, with $p < \infty$, there is a sequence $\{f_n\}$ with
$f_n \in C_0^\infty$, and $f_n \to f$ both in the $L^2$ and $L^p$ norms. (This follows from the
argument to solve Exercise 7 in Chapter 1 as well as the references therein.)

For our purposes, a particularly helpful observation is that whenever
$f \in C_0^\infty(\mathbb{R})$ then its Cauchy integral $F(z) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{f(t)}{t - z}\,dt$ extends as a
continuous function on the closure of the upper half-plane, is bounded
there, and moreover satisfies the decay inequality

$$|F(z)| \le \frac{M}{1 + |z|}, \quad z = x + iy,\ y \ge 0, \tag{21}$$

for an appropriate constant $M$. The simplest way to prove this is to
use the Fourier transform representation (11). Then the rapid decrease
at infinity of $\hat{f}$ shows that $F$ is continuous and bounded in the closed
half-plane $\overline{\mathbb{R}^2_+}$. Moreover the smoothness of $f$ lets us integrate by parts,
giving

$$F(z) = \int_0^\infty \hat{f}(\xi)\, e^{2\pi i z \xi}\,d\xi
      = \frac{1}{2\pi i z} \int_0^\infty \hat{f}(\xi)\, \frac{d(e^{2\pi i z \xi})}{d\xi}\,d\xi
      = -\frac{1}{2\pi i z} \left[ \int_0^\infty \hat{f}\,'(\xi)\, e^{2\pi i z \xi}\,d\xi + \hat{f}(0) \right].$$

As a result, $|F(z)| \le M_0/|z|$, so together with the boundedness of $F$
the estimate (21) is established. Notice also that the continuity of $F$
with (11), (16) and (17) yields

$$2F(x) = 2 \lim_{y \to 0} F(x + iy) = f(x) + iH(f)(x). \tag{22}$$

It is also important to remark here that if $f$ is real-valued (as we have
assumed), then by (14) the Hilbert transform $H(f)$ is also real-valued.

With these matters out of the way, the main conclusions can be obtained
in a few strokes.

Step 1: Cauchy's theorem. We see first that

$$\int_{-\infty}^{\infty} (F(x))^k\,dx = 0, \quad \text{whenever } k \text{ is an integer, } k \ge 2. \tag{23}$$

Indeed, if we integrate the analytic function $(F(z))^k$ over the contour $\gamma$
in the upper half-plane consisting of the rectangle (see Figure 6) whose
vertices are $R + i\epsilon$, $R + iR$, $-R + iR$, and $-R + i\epsilon$, then by Cauchy's
theorem $\int_\gamma (F(z))^k\,dz = 0$. Letting $\epsilon \to 0$ and $R \to \infty$, also taking into
account the continuity of $F$ and the decay (21) then gives (23). (Note
also that by (21), we have $H(f) \in L^p$ for all $p > 1$.)

Figure 6. The rectangle of integration $\gamma$


We now exploit (23). Observe that when $k = 2$, if we take the real
parts of this identity (using that $f$ and $H(f)$ are real-valued), we have
$\int_{-\infty}^{\infty} (f^2 - (Hf)^2)\,dx = 0$. This is essentially the unitarity of $H$ on $L^2$
that we mentioned previously.

Next we consider other values of $k \ge 2$, those when $k$ is even, $k = 2\ell$.
(When $k$ is odd, the identity (23) does not have an immediately useful
consequence.) Suppose, for example, that $k = 4$. Then the real part
of (23) gives us

$$\int f^4\,dx - 6 \int f^2 (Hf)^2\,dx + \int (Hf)^4\,dx = 0.$$

As a result,

$$\int (Hf)^4\,dx \le 6 \int f^2 (Hf)^2\,dx \le 6 \left( \int f^4\,dx \right)^{1/2} \left( \int (Hf)^4\,dx \right)^{1/2},$$

the last majorization following by Schwarz's inequality. Hence

$$\left( \int (Hf)^4\,dx \right)^{1/2} \le 6 \left( \int f^4\,dx \right)^{1/2},$$

which means

$$\|H(f)\|_{L^4} \le 6^{1/2} \|f\|_{L^4}.$$


In the same way, if we take $p = 2\ell$, with $\ell$ an integer $\ge 1$, we obtain

$$\|H(f)\|_{L^p} \le A_p \|f\|_{L^p}, \quad p = 2\ell. \tag{24}$$

Indeed, the real part of $(f + iH(f))^{2\ell}$ is

$$\sum_{r=0}^{\ell} f^{2r} (Hf)^{2\ell - 2r} c_r, \quad \text{where } c_r = (-1)^{\ell - r} \binom{2\ell}{2r}, \ r = 0, 1, \ldots, \ell.$$

Hence

$$\int (Hf)^{2\ell}\,dx \le \sum_{r=1}^{\ell} a_r \int f^{2r} (Hf)^{2\ell - 2r}\,dx,$$

with $a_r = \binom{2\ell}{2r}$. Now Hölder's inequality (with dual exponents $\frac{2\ell}{2r}$ and $\frac{2\ell}{2\ell - 2r}$)
shows that

$$\int f^{2r} (Hf)^{2\ell - 2r}\,dx \le \|f\|_{L^p}^{2r}\, \|H(f)\|_{L^p}^{2\ell - 2r},$$

with $p = 2\ell$. Thus

$$\|H(f)\|_{L^p}^{2\ell} \le \sum_{r=1}^{\ell} a_r \|f\|_{L^p}^{2r}\, \|H(f)\|_{L^p}^{2\ell - 2r}.$$

Note that this inequality is jointly homogeneous of degree $2\ell$ in $\|f\|_{L^p}$
and $\|H(f)\|_{L^p}$. Moreover the right-hand side is of degree at most $2\ell - 2$
in $\|H(f)\|_{L^p}$. Upon normalizing $f$ so that $\|f\|_{L^p} = 1$, and setting
$X = \|H(f)\|_{L^p}$ we have $X^{2\ell} \le \sum_{r=1}^{\ell} a_r X^{2\ell - 2r}$. Now either $X \le 1$ or
$X > 1$. In the second case, then $X^{2\ell} \le \left( \sum_{r=1}^{\ell} a_r \right) X^{2\ell - 2}$. As a result
$X^2 \le \sum_{r=1}^{\ell} a_r \le 2^{2\ell}$. In either case $X \le 2^\ell$, and therefore (24) is proved
with $A_p = 2^{p/2}$.
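
The bound $\sum_{r=1}^{\ell} a_r \le 2^{2\ell}$ used in the last step is a standard binomial identity; the sketch below records it and checks the case $\ell = 2$ against the constant $6^{1/2}$ found by hand for $p = 4$. Here $X$ is the normalized quantity $\|H(f)\|_{L^p}$ with $\|f\|_{L^p} = 1$, as in the text.

```latex
% The even-index binomial coefficients of 2\ell sum to 2^{2\ell-1}.
\begin{align*}
\sum_{r=0}^{\ell}\binom{2\ell}{2r} &= 2^{2\ell-1}
  \quad\Longrightarrow\quad
  \sum_{r=1}^{\ell} a_r = 2^{2\ell-1}-1 \le 2^{2\ell}, \\
\ell = 2:\quad a_1 &= \binom{4}{2} = 6,\quad a_2 = \binom{4}{4} = 1,
  \qquad X > 1 \ \Longrightarrow\ X^{2} \le a_1 + a_2 = 7 \le 2^{4}.
\end{align*}
```

So for $p = 4$ the general argument gives $X \le 7^{1/2}$, slightly weaker than the $6^{1/2}$ obtained directly from Schwarz's inequality, and both are well within the stated bound $A_4 = 2^{2} = 4$.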

To carry out the next step we extend the basic inequality (24), proved
for $f \in C_0^\infty$, to $f$ that are simple functions. Recall that we have already
defined $H(f)$ whenever $f$ is in $L^2$, and in particular if $f$ is simple. Next,
since such $f$ belongs to $L^2 \cap L^p$ we can find a sequence $\{f_n\}$, with
$f_n \in C_0^\infty$, so that $f_n \to f$ both in the $L^2$ and $L^p$ norms. As a result, $\{H(f_n)\}$
is a Cauchy sequence in both the $L^2$ and $L^p$ norms, while $H(f_n) \to H(f)$
in the $L^2$ norm. Thus (24) is established when $f$ is simple.

Step 2: Interpolation. Having proved (24) for simple functions and $p$
even, we can apply the Riesz interpolation theorem once we have extended
$H$ to complex-valued functions. But this is easily done by setting
$H(f_1 + if_2) = H(f_1) + iH(f_2)$, for $f_1$ and $f_2$ real-valued. Note that as a
result, the inequality (24) extends to this case, but with $A_p$ replaced by
$2A_p$. (By a further argument we can show that the original bound $A_p$
holds in this case also. See Exercise 8.)

With this in mind Riesz interpolation yields the inequality

$$\|H(f)\|_{L^p} \le A_p \|f\|_{L^p}$$

for all $p$ such that $2 \le p \le 2\ell$, where $\ell$ is any positive integer. This follows
by taking $p_0 = q_0 = 2$, $p_1 = q_1 = 2\ell$ and noting that if $1/p = (1 - t)/2 + t/(2\ell)$,
then $p$ ranges over the interval $2 \le p \le 2\ell$ when $t$ ranges over
$0 \le t \le 1$. Since $\ell$ may be taken to be arbitrarily large, we get (20) for
all $2 \le p < \infty$ and $f$ simple.

Step 3: Duality. We pass from the case $2 \le p < \infty$ to the case $1 < p \le 2$
by duality. This passage is based on the simple identity

$$\int_{-\infty}^{\infty} (Hf)\, g\,dx = - \int_{-\infty}^{\infty} f\, (Hg)\,dx \tag{25}$$

whenever $f$ and $g$ belong to $L^2(\mathbb{R})$ and are now allowed to be
complex-valued. In fact this follows immediately from Plancherel's identity
$(f, g) = (\hat{f}, \hat{g})$, and the definition (13), which can be restated as

$$\widehat{H(f)}(\xi) = \frac{\operatorname{sign}(\xi)}{i}\, \hat{f}(\xi).$$

One can invoke the abstract duality principle in Theorem 5.5 of Chapter 1
or proceed directly as follows. Restricting attention to $f$ and $g$
simple, one has by Lemma 4.2 in the previous chapter, with $1 < p \le 2$,

$$\|H(f)\|_{L^p} = \sup_g \left| \int H(f)\, g\,dx \right|,$$

where the supremum is taken over $g$ simple, with $\|g\|_{L^q} \le 1$, $1/p + 1/q = 1$.
However, by (25) and Hölder's inequality, this is equal to

$$\sup_g \left| \int f\, H(g)\,dx \right| \le \sup_g \|f\|_{L^p} \|H(g)\|_{L^q} \le \|f\|_{L^p} A_q,$$

using (20) for $q$ in place of $p$, and noting that $2 \le q < \infty$.

Therefore (20) holds for all $p$, $1 < p < \infty$, for all simple functions $f$.
The passage to all $f \in L^2 \cap L^p$, and thus to the general result, is by now
a familiar limiting argument.


4 The maximal function and weak-type estimates

Another important illustration of the occurrence of $L^p$ spaces is in
connection with the maximal function $f^*$. For appropriate functions $f$ given
on $\mathbb{R}^d$, the maximal function $f^*$ is defined by

$$f^*(x) = \sup_{x \in B} \frac{1}{m(B)} \int_B |f(y)|\,dy,$$

where the supremum is taken over all balls $B$ containing $x$, and $m$ (as
well as $dy$) denote the Lebesgue measure.$^{7}$

7 An introduction to $f^*$, and a complete proof of (27) below can be found
for instance in Chapter 3 of Book III.

It is a fact that $f^*$ plays a role in a wide variety of questions in analysis,
and it is there that its $L^p$ inequality

$$\|f^*\|_{L^p} \le A_p \|f\|_{L^p}, \quad 1 < p \le \infty, \tag{26}$$

is of crucial interest.

Before we come to the proof of (26) a few observations are in order.
First, the mapping $f \mapsto f^*$ is not linear, but does satisfy the sub-additive
property that $f^* \le f_1^* + f_2^*$, whenever $f = f_1 + f_2$.

Next, while (26) obviously holds for $p = \infty$ (with $A_\infty = 1$), the
inequality for $p = 1$ fails. This can be seen directly by taking $f$ to be
the characteristic function of the unit ball $B$, and noticing that then
$f^*(x) \ge 1/(1 + |x|)^d$. This function clearly fails to be integrable at
infinity. The asserted inequality follows immediately from the fact that for
each $x \in \mathbb{R}^d$ the ball of radius $1 + |x|$ centered at $x$ contains $B$. There
are also simple examples where the integrability of $f^*$ fails locally. (See
Exercise 12.)
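
The two claims in this example can be written out explicitly. The sketch below assumes that $B$ is the unit ball centered at the origin; the names $B_x$ (the ball of radius $1 + |x|$ centered at $x$) and $\sigma_{d-1}$ (the surface measure of the unit sphere, equal to $2$ when $d = 1$) are introduced here only for illustration.

```latex
% Assumptions: B = unit ball centered at 0, f = \chi_B, B_x = ball of radius 1+|x| about x.
\begin{align*}
f^*(x) &\ge \frac{1}{m(B_x)}\int_{B_x}\chi_B(y)\,dy
        = \frac{m(B)}{m(B_x)}
        = \frac{m(B)}{(1+|x|)^{d}\,m(B)}
        = \frac{1}{(1+|x|)^{d}}, \\
\int_{\mathbb{R}^d}\frac{dx}{(1+|x|)^{d}}
       &= \sigma_{d-1}\int_0^\infty \frac{r^{d-1}}{(1+r)^{d}}\,dr = \infty,
       \qquad\text{since}\quad \frac{r^{d-1}}{(1+r)^{d}} \ge \frac{2^{-d}}{r}
       \quad\text{for } r \ge 1.
\end{align*}
```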

There is nevertheless a very useful substitute for $L^1$ boundedness
for $f^*$. It is the weak-type inequality: there is a bound $A$ (independent
of $f$), so that

$$m(\{x : f^*(x) > \alpha\}) \le \frac{A}{\alpha} \|f\|_{L^1(\mathbb{R}^d)}, \quad \text{for all } \alpha > 0. \tag{27}$$

We briefly recall the main steps in the proof of (27). If we denote by
$E_\alpha = \{x : f^*(x) > \alpha\}$, then to obtain the above majorization for $m(E_\alpha)$ it
suffices to have the same for $m(K)$, where $K$ is any compact subset of $E_\alpha$.
Now, using the definition of $f^*$ we can cover $K$ by a finite collection of
balls $B_1, B_2, \ldots, B_N$ with $\int_{B_i} |f(x)|\,dx \ge \alpha\, m(B_i)$, for each $i$. If we then
apply a Vitali covering lemma, we can select a disjoint sub-collection of
these balls $B_{i_1}, B_{i_2}, \ldots, B_{i_n}$ with $\sum_{j=1}^{n} m(B_{i_j}) \ge 3^{-d} m(K)$. Adding the
above inequalities over the disjoint balls then gives
$m(K) \le \frac{3^d}{\alpha} \int_{\mathbb{R}^d} |f(x)|\,dx$, which leads to (27).

4.1 The $L^p$ inequality

We turn to the proof of the $L^p$ inequality for the maximal function. It
is formulated as follows.

Theorem 4.1 Suppose $f \in L^p(\mathbb{R}^d)$ with $1 < p \le \infty$. Then $f^* \in L^p(\mathbb{R}^d)$,
and (26) holds, namely

$$\|f^*\|_{L^p} \le A_p \|f\|_{L^p}.$$

The bound $A_p$ depends on $p$ but is independent of $f$.

Let us first see why $f^*(x) < \infty$, for a.e. $x$, whenever $f \in L^p$. Observe
that we can decompose $f = f_1 + f_\infty$, where $f_1(x) = f(x)$ if $|f(x)| > 1$,
and $f_1(x) = 0$ elsewhere; also $f_\infty(x) = f(x)$ if $|f(x)| \le 1$ and $f_\infty(x) = 0$
elsewhere. Then $f_1 \in L^1$ and $f_\infty \in L^\infty$. But clearly $f^* \le f_1^* + f_\infty^* \le f_1^* + 1$,
since $|f_\infty(x)| \le 1$ everywhere. Now from (27) (with $f_1$ in place
of $f$), we see that $f_1^*$ is finite almost everywhere. Thus the same is true
for $f^*$.





The proof that $f^* \in L^p$ relies on a more quantitative version of the
argument just given. We strengthen the weak-type inequality (27) by
incorporating in it the $L^\infty$ boundedness of the mapping $f \mapsto f^*$. The
stronger version states

$$m(\{x : f^*(x) > \alpha\}) \le \frac{A'}{\alpha} \int_{|f| > \alpha/2} |f|\,dx, \quad \text{for all } \alpha > 0. \tag{28}$$

Here $A'$ is a different constant; it can be taken to be $2A$. The improvement
of (27) (except for a different constant, which is inessential) is
that here we only integrate over the set where $|f(x)| > \alpha/2$, instead of
the whole of $\mathbb{R}^d$.

To prove (28) we write $f = f_1 + f_\infty$, where now $f_1(x) = f(x)$ if $|f(x)| > \alpha/2$,
and $f_\infty(x) = f(x)$ if $|f(x)| \le \alpha/2$. Then $f^* \le f_1^* + f_\infty^* \le f_1^* + \alpha/2$,
since $|f_\infty(x)| \le \alpha/2$ for all $x$. Therefore $\{x : f^*(x) > \alpha\} \subset \{x : f_1^*(x) > \alpha/2\}$,
and applying the weak-type inequality (27) to $f_1$ in place of $f$
(and $\alpha/2$ in place of $\alpha$) then immediately yields (28), with $A' = 2A$.


Distribution function 

We will next need an observation concerning the quantity occurring on
the left-hand side of the inequalities (27) and (28), which we formulate
more generally as follows. Suppose $F$ is any non-negative measurable
function. Then its distribution function, $\lambda(\alpha) = \lambda_F(\alpha)$, is defined for
positive $\alpha$ by

$$\lambda(\alpha) = m(\{x : F(x) > \alpha\}).$$

The key point here is that for any $0 < p < \infty$,

$$\int_{\mathbb{R}^d} (F(x))^p\,dx = \int_0^\infty \lambda(\alpha^{1/p})\,d\alpha, \tag{29}$$

and this holds in the extended sense (that is, both sides are simultaneously
finite and equal, or both sides are infinite).

To see this, consider first the case $p = 1$. Then the identity is an
immediate consequence of Fubini's theorem, in the setting $\mathbb{R}^d \times \mathbb{R}^+$,
applied to the characteristic function of the set $\{(x, \alpha) : F(x) > \alpha > 0\}$.
Indeed, integrating the characteristic function first in $\alpha$ then in $x$
gives $\int_{\mathbb{R}^d} \left( \int_0^{F(x)} d\alpha \right) dx$, while integrating in the reverse order yields
$\int_0^\infty m(\{x : F(x) > \alpha\})\,d\alpha$, and this shows (29) for $p = 1$. Finally, let
$G(x) = (F(x))^p$, so $\{x : G(x) > \alpha\} = \{x : F(x) > \alpha^{1/p}\}$. Using (29) for
$p = 1$ (and $G$ instead of $F$) then gives the conclusion for general $p$.

We also note that

$$\lambda(\alpha) \le \frac{1}{\alpha} \int_{\mathbb{R}^d} F(x)\,dx,$$

which is Tchebychev's inequality. In fact,

$$\int_{\mathbb{R}^d} F(x)\,dx \ge \int_{F(x) > \alpha} F(x)\,dx \ge \alpha\, m(\{x : F(x) > \alpha\}),$$

and this proves the assertion. One also sees, more generally,
$\lambda(\alpha) \le \frac{1}{\alpha^p} \int (F(x))^p\,dx$ for $p > 0$.
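
As a sanity check of (29) and of Tchebychev's inequality, one can work them out for the simplest possible $F$. The choice $F = c\,\chi_E$, with $c > 0$ a constant and $E$ a set of finite measure, is an illustrative assumption and does not come from the text.

```latex
% Illustrative choice (not from the text): F = c \chi_E, with c > 0 and m(E) < \infty.
\begin{align*}
\lambda(\alpha) &=
  \begin{cases} m(E), & 0 < \alpha < c,\\ 0, & \alpha \ge c, \end{cases} \\
\int_0^\infty \lambda(\alpha^{1/p})\,d\alpha
  &= \int_0^{c^{p}} m(E)\,d\alpha = c^{p}\,m(E)
   = \int_{\mathbb{R}^d} (F(x))^{p}\,dx, \\
\lambda(\alpha) &\le \frac{c}{\alpha}\,m(E)
   = \frac{1}{\alpha}\int_{\mathbb{R}^d} F(x)\,dx
   \qquad\text{(trivially for } \alpha \ge c\text{, and because } c/\alpha > 1 \text{ for } \alpha < c\text{)}.
\end{align*}
```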

We now apply (29) to $F(x) = f^*(x)$, utilizing (28). Then

$$\int_{\mathbb{R}^d} (f^*(x))^p\,dx = \int_0^\infty \lambda(\alpha^{1/p})\,d\alpha
  \le A' \int_0^\infty \alpha^{-1/p} \left( \int_{|f| > \alpha^{1/p}/2} |f|\,dx \right) d\alpha.$$

We evaluate the integral on the right-hand side by interchanging the
order of integration. It then becomes

$$A' \int_{\mathbb{R}^d} |f(x)| \left( \int_0^{|2f(x)|^p} \alpha^{-1/p}\,d\alpha \right) dx.$$

However, if $p > 1$, $\int_0^t \alpha^{-1/p}\,d\alpha = a_p t^{1 - 1/p}$, for all $t \ge 0$ (with $a_p = p/(p - 1)$).
So the double integral equals $A' a_p 2^{p-1} \int_{\mathbb{R}^d} |f(x)|\, |f(x)|^{p-1}\,dx$, which
is $A_p^p \|f\|_{L^p}^p$, with $A_p^p = A' a_p 2^{p-1}$, and this gives (26), proving the
theorem.

Note, as a result of the above proof, that the constant $A_p$ in (26)
satisfies $A_p = O(1/(p - 1))$ as $p \to 1$.
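
The growth of the constant can be made explicit by tracing it through the proof. The sketch below assumes, for convenience only, that the weak-type constant in (27) satisfies $A \ge 1$ (which can always be arranged by enlarging $A$) and uses the choice $A' = 2A$ from (28).

```latex
% Assumption: A >= 1 in (27); A' = 2A as in (28); a_p = p/(p-1).
\begin{align*}
A_p^{\,p} &= A'\,a_p\,2^{p-1} = A\,\frac{p}{p-1}\,2^{p}
  \quad\Longrightarrow\quad
  A_p = 2\left(\frac{A\,p}{p-1}\right)^{1/p}, \\
1 < p \le 2:\quad
A_p &\le 2\cdot\frac{A\,p}{p-1} \le \frac{4A}{p-1},
  \qquad\text{because } \frac{A\,p}{p-1} \ge 1 \text{ and } \frac{1}{p} \le 1 .
\end{align*}
```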

Remark. The Hilbert transform $H(f)$, like the maximal function $f^*$,
also satisfies a weak-type $L^1$ inequality, a result we will prove in a more
general setting in the next chapter. In fact, this weak-type inequality
will then be used to prove $L^p$ inequalities for the generalizations of the
Hilbert transform to $\mathbb{R}^d$ in much the same way as they are used above
for the maximal function.


5 The Hardy space $\mathbf{H}^1_r$

We now come to the real Hardy space $\mathbf{H}^1_r(\mathbb{R}^d)$, which plays a significant
role as another substitute for $L^1(\mathbb{R}^d)$, in the context where important
$L^p$ inequalities for $p > 1$ break down at $p = 1$. This space is a Banach
space that is “near” $L^1$, and whose dual space also occurs naturally in
many applications. Moreover, $\mathbf{H}^1_r$ stands in sharp contrast to the space of
weak-type functions considered above: the latter space cannot be made
into a Banach space, nor does it have any bounded linear functionals.
(See Exercise 15.)

The space $\mathbf{H}^1_r(\mathbb{R}^d)$ arose first for $d = 1$ in the setting of complex
analysis as the “real parts” of the boundary values of functions of the complex
Hardy space $H^p$, when $p = 1$. The Hardy space $H^p$, in the version of the
upper half-plane, consists of holomorphic functions $F$ on $\mathbb{R}^2_+$ for which

$$\sup_{y > 0} \int_{-\infty}^{\infty} |F(x + iy)|^p\,dx < \infty,$$

and whose norm $\|F\|_{H^p}$ is defined as the $p$-th root of the quantity on the
left-hand side of the above inequality.$^{8}$

Now, it can be shown that whenever $F \in H^p$, $p < \infty$, then the limit
$F_0(x) = \lim_{y \to 0} F(x + iy)$ exists in the $L^p(\mathbb{R})$ norm and in fact
$\|F\|_{H^p} = \|F_0\|_{L^p(\mathbb{R})}$. Moreover, when $1 < p < \infty$, Riesz's theorem can be
reinterpreted to say that $2F_0 = f + iH(f)$ where $f$ is a real-valued function in
$L^p(\mathbb{R})$. Conversely, every element $F \in H^p$ arises in this way. Thus, when
$1 < p < \infty$ we see that the Banach space $H^p$ is the same, up to
equivalence of norms, as (real) $L^p(\mathbb{R})$. The equivalence breaks down at $p = 1$,
since the Hilbert transform $H$ is not bounded on $L^1$. This situation led
to the original definition of $\mathbf{H}^1_r(\mathbb{R})$: the space of real-valued functions $f$
that arise as $2F_0 = f + iH(f)$ where $F \in H^1$. Equivalently, $f \in \mathbf{H}^1_r(\mathbb{R})$
if and only if $f \in L^1(\mathbb{R})$ and $H(f)$, defined in an appropriate “weak”
sense, also belongs to $L^1(\mathbb{R})$. (An outline of the proof of these assertions
can be found in Problems 2, 7*, and 8*.)

The notion of $\mathbf{H}^1_r$ was later extended to $\mathbb{R}^d$, $d > 1$, and various
equivalent defining properties were ultimately found. It turns out that the
simplest of these to state, and the most useful in applications, is the
definition in terms of decompositions into “atoms.” To this we now turn.

8 The case $p = 2$ is treated in Section 2, Chapter 5 of Book III.

5.1 Atomic decomposition of $\mathbf{H}^1_r$

A bounded measurable function $a$ on $\mathbb{R}^d$ is an atom associated to a ball
$B \subset \mathbb{R}^d$, if:

(i) $a$ is supported in $B$, with $|a(x)| \le 1/m(B)$, for all $x$; and,

(ii) $\int_{\mathbb{R}^d} a(x)\,dx = 0$.

Note that (i) guarantees that for each atom $a$ we have $\|a\|_{L^1(\mathbb{R}^d)} \le 1$.
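
A concrete atom may help fix the definition. The following one-dimensional example is an illustration chosen here, not one given in the text: for $r > 0$ take $B = (-r, r)$ and let $a$ be an odd step function on $B$.

```latex
% Illustrative atom in d = 1 (not from the text): B = (-r, r), m(B) = 2r.
\begin{align*}
a(x) &= \frac{1}{2r}\Big(\chi_{(0,r)}(x) - \chi_{(-r,0)}(x)\Big), \\
|a(x)| &\le \frac{1}{2r} = \frac{1}{m(B)}, \qquad
\int_{\mathbb{R}} a(x)\,dx = \frac{1}{2r}\,(r - r) = 0, \qquad
\|a\|_{L^1(\mathbb{R})} = \frac{1}{2r}\cdot 2r = 1 .
\end{align*}
```

For $r = 1$ this is one half of the odd function $g$ considered in Section 3.2, whose Hilbert transform was seen there to decay like $1/x^2$ at infinity.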

The space $\mathbf{H}^1_r(\mathbb{R}^d)$ consists of all $L^1$ functions $f$ that can be written as

$$f = \sum_{k=1}^{\infty} \lambda_k a_k, \tag{30}$$

where the $a_k$ are atoms and the $\lambda_k$ are scalars with

$$\sum_{k=1}^{\infty} |\lambda_k| < \infty. \tag{31}$$

Observe that (31) insures that the sum (30) converges in the $L^1$ norm.
The infimum of the values $\sum |\lambda_k|$, taken over all possible decompositions
of $f$ of the form (30) is, by definition, the $\mathbf{H}^1_r$ norm of $f$, written as
$\|f\|_{\mathbf{H}^1_r}$.

One can then observe the following properties of $\mathbf{H}^1_r$:

• With the above norm the space $\mathbf{H}^1_r$ is complete, hence is a Banach
space. If $f$ belongs to $\mathbf{H}^1_r$ then $f$ belongs to $L^1$ and
$\|f\|_{L^1(\mathbb{R}^d)} \le \|f\|_{\mathbf{H}^1_r}$; also obviously $\int f(x)\,dx = 0$.

• However, the above necessary conditions are far from sufficient to
imply $f \in \mathbf{H}^1_r$.

• The significance of the cancelation condition (ii) was already indicated
at the end of Section 3.2. Moreover, if one drops this cancelation
property for atoms, then sums of the kind (30) represent
arbitrary functions in $L^1(\mathbb{R}^d)$.

• However, in the opposite direction if $f$ is any $L^p(\mathbb{R}^d)$ function, $1 < p$,
(say) of bounded support that satisfies the cancelation condition
$\int f(x)\,dx = 0$, then $f$ belongs to $\mathbf{H}^1_r$.

Proofs of the first three assertions are outlined in Exercises 16, 17, and 18.
The fourth assertion is the deepest of these. Its proof, which follows
below, provides us with valuable insight into the nature of $\mathbf{H}^1_r$, and its
ideas will be exploited in several circumstances later.

We state the result mentioned above. 

Proposition 5.1 Suppose $f \in L^p(\mathbb{R}^d)$, $p > 1$, and $f$ has bounded support.
Then $f$ belongs to $\mathbf{H}^1_r(\mathbb{R}^d)$ if and only if $\int_{\mathbb{R}^d} f(x)\,dx = 0$.

Note that $f$ is automatically in $L^1$, by Hölder's inequality (see
Proposition 1.4 in Chapter 1), and the cancelation condition is necessary as has
been pointed out.

To prove the sufficiency we assume that $f$ is supported in a ball $B_1$
of unit radius, and that $\int_{B_1} |f(x)|\,dx \le 1$. These normalizations can be
achieved by a simple change of scale and multiplication of $f$ by an
appropriate constant. We next consider a truncated version of the maximal
function $f^*$. We define $f^\dagger$ by

$$f^\dagger(x) = \sup \frac{1}{m(B)} \int_B |f(y)|\,dy,$$

where the supremum is taken over all balls $B$ of radius $\le 1$ that contain $x$.
We note that under our assumptions we have

$$\int_{\mathbb{R}^d} f^\dagger(x)\,dx < \infty. \tag{32}$$

Indeed, $f^\dagger(x) = 0$ if $x \notin B_3$, where $B_3$ is the ball with same center as $B_1$,
but with radius 3. This is because if $x \notin B_3$ and $x \in B$ with the radius of
$B$ less than or equal to 1, then $B$ must be disjoint from $B_1$, the support
of $f$. Thus

$$\int_{\mathbb{R}^d} f^\dagger(x)\,dx = \int_{B_3} f^\dagger(x)\,dx \le c \left( \int_{B_3} (f^\dagger(x))^p\,dx \right)^{1/p}, \quad c = m(B_3)^{1 - 1/p},$$

by Hölder's inequality. However the last integral is finite by Theorem 4.1,
since clearly $f^\dagger(x) \le f^*(x)$.

Now for each $\alpha \ge 1$, we consider a basic decomposition of $f$ at “height”
$\alpha$, carried out with respect to the set $E_\alpha = \{x : f^\dagger(x) > \alpha\}$. This is a
variant of the important “Calderón-Zygmund decomposition.” It will be
a little simpler to carry out the steps when $d = 1$, and this we do first;
we return to the general case $d \ge 2$ immediately afterwards. The reader
who is impatient with the technicalities of the next few pages may want
to glance ahead to the lemma in Section 3.2 of the next chapter, where
a more streamlined version of the decomposition appears.

This decomposition allows us to write $f = g + b$ where

$$|g| \le c\alpha, \quad \text{for an appropriate constant } c,^{9} \tag{33}$$

and where $b$ is supported in $E_\alpha$. In fact, since, as is easily seen, the set
$E_\alpha$ is open, we can write $E_\alpha = \bigcup_j I_j$, where $I_j$ are disjoint open intervals,
and we will be able to construct $b$ so that $b = \sum_j b_j$, with $b_j$ supported
on $I_j$ and satisfying

$$\int_{I_j} b_j(x)\,dx = 0, \quad \text{for all } j. \tag{34}$$

9 Here we continue the practice of using $c$, $c_1$, etc. to denote constants
that may not be the same in different places.

The key observation used in this construction is

$$\frac{1}{m(I_j)} \int_{I_j} |f(x)|\,dx \le \alpha, \quad \text{for all } j. \tag{35}$$

When $m(I_j) \ge 1$, the inequality (35) is automatic in view of our
assumptions that $\int |f(x)|\,dx \le 1$ and $\alpha \ge 1$. Otherwise, writing $I_j = (x_1, x_2)$
we note that (35) follows because $x_1 \in E_\alpha^c$, and hence $f^\dagger(x_1) \le \alpha$ while
$f^\dagger(x_1) \ge \frac{1}{m(I_j)} \int_{x_1}^{x_2} |f(x)|\,dx$.

As a result, if

$$m_j = \frac{1}{m(I_j)} \int_{I_j} f(x)\,dx$$

denotes the mean of $f$ on $I_j$, then $|m_j| \le \alpha$. Since $1 = \chi_{E_\alpha^c} + \sum_j \chi_{I_j}$,
we can write $f = g + b$ with

$$g = f \chi_{E_\alpha^c} + \sum_j m_j \chi_{I_j}$$

and

$$b = \sum_j (f - m_j) \chi_{I_j} = \sum_j b_j,$$

where the $b_j$'s are defined by $b_j = (f - m_j)\chi_{I_j}$, and the $\chi$'s designate the
characteristic functions of the indicated sets. Note that on $E_\alpha^c$ we have
$f^\dagger(x) \le \alpha$, so that $|f(x)| \le \alpha$ for a.e. $x$ on this set by the differentiation
theorem.$^{10}$ Since the $I_j$ are disjoint, (35) then guarantees that (33) holds,
with $c = 1$. The cancelation property (34) is also clear because

$$\int_{I_j} b_j(x)\,dx = \int_{I_j} (f(x) - m_j)\,dx = m(I_j)(m_j - m_j) = 0.$$


10 See for instance Theorem 1.3 in Chapter 3 of Book III.

With the decomposition $f = g + b$ given for each $\alpha$, we now consider
simultaneously all decompositions of this form for $\alpha = 2^k$, $k = 0, 1, 2, \ldots$.
Thus for each $k$ we can write $f = g^k + b^k$, with $|g^k| \le c2^k$, $b^k = \sum_j b_j^k$,
where $b_j^k$ is supported on open intervals $I_j^k$, which for fixed $k$ are disjoint,
and moreover $E_{2^k} = \{x : f^\dagger(x) > 2^k\} = \bigcup_j I_j^k$, while $\int_{I_j^k} b_j^k(x)\,dx = 0$.

Now since $b^k$ is supported in the set $E_{2^k}$, and the sets $E_{2^k}$ are decreasing
with $m(E_{2^k}) \to 0$, as $k \to \infty$, we have that $b^k \to 0$ almost everywhere,
as $k \to \infty$. Thus $f = \lim_{k \to \infty} g^k$ a.e., and

$$f = g^0 + \sum_{k=0}^{\infty} (g^{k+1} - g^k).$$

However,

$$g^{k+1} - g^k = b^k - b^{k+1} = \sum_j \Big( b_j^k - \sum_{I_i^{k+1} \subset I_j^k} b_i^{k+1} \Big) = \sum_j A_j^k,$$

where $A_j^k = b_j^k - \sum_{I_i^{k+1} \subset I_j^k} b_i^{k+1}$; the last identity holds because each
$I_i^{k+1}$ is contained in exactly one $I_j^k$. The $A_j^k$ are supported in the
intervals $I_j^k$, and by the cancelation properties of $b_j^k$ and $b_i^{k+1}$, we have
that $\int A_j^k(x)\,dx = 0$. Also since $|g^{k+1} - g^k| \le c2^{k+1} + c2^k = 3c2^k$ and
$g^{k+1} - g^k = b^k - b^{k+1}$, the disjointness of the intervals $\{I_j^k\}_j$ shows that
$|A_j^k| \le 3c2^k$. As a result we will see that the sum

$$f = g^0 + \sum_{k,j} A_j^k \tag{36}$$

will give us an atomic decomposition of $f$. In fact we set

$$a_j^k = \frac{1}{m(I_j^k)\, 3c2^k} A_j^k \quad \text{and} \quad \lambda_j^k = m(I_j^k)\, 3c2^k,$$

and $f = g^0 + \sum_{k,j} \lambda_j^k a_j^k$. Now the $a_j^k$ are atoms (associated to the
intervals $I_j^k$) while

$$\sum_{k,j} \lambda_j^k = 3c \sum_{k=0}^{\infty} 2^k \sum_j m(I_j^k) = 3c \sum_{k=0}^{\infty} 2^k m(\{f^\dagger(x) > 2^k\}).$$

However, because $m(\{f^\dagger(x) > \alpha\})$ is decreasing in $\alpha$,

$$2^k m(\{f^\dagger(x) > 2^k\}) \le 2 \int_{2^{k-1}}^{2^k} m(\{f^\dagger(x) > \alpha\})\,d\alpha,$$

and hence summing in $k$ we find that $\sum_{k,j} \lambda_j^k < \infty$, because

$$\int_0^\infty m(\{f^\dagger(x) > \alpha\})\,d\alpha = \int_{\mathbb{R}} f^\dagger(x)\,dx < \infty$$

as we saw by (29) and (32). Finally, $g^0$ is bounded and supported in $B_3$,
while $\int g^0(x)\,dx = 0$ because of the cancelation properties of $f$ and $A_j^k$.
Hence $g^0$ is a multiple of an atom, and this yields that (36) is an atomic
decomposition of $f$.

To extend the result to general $d$ we need to modify the argument just
given in one point: the appropriate analog of the decomposition of the
open set $E_\alpha = \{x : f^\dagger(x) > \alpha\}$ into a disjoint union of open intervals
is its decomposition into a union of (closed) cubes whose interiors are
disjoint and so that the distance from each cube to $E_\alpha^c$ is comparable to
the diameter of the cube.$^{11}$ It is also helpful to take the cubes entering
in this union to be dyadic cubes. These cubes are defined as follows.

11 This kind of decomposition already arose in Chapter 1 of Book III.

The dyadic cubes of the 0-th generation are the closed cubes of side-length 1,
whose vertices are points with integral coordinates. The dyadic
cubes of the $k$-th generation are the cubes of the form $2^{-k}Q$, where $Q$ is a
cube of the 0-th generation. Notice that bisecting the edges of any dyadic
cube of the $k$-th generation decomposes it into $2^d$ cubes of the $(k+1)$-th
generation whose interiors are disjoint. Observe also that if $Q_1$ and $Q_2$
are dyadic cubes (of possibly different generations), and their interiors
intersect, then either $Q_1 \subset Q_2$, or $Q_2 \subset Q_1$.

The decomposition we need of an open set into a union of such cubes
is as follows.

Lemma 5.2 Suppose $\Omega \subset \mathbb{R}^d$ is a non-trivial open set. Then there is
a collection $\{Q_j\}$ of dyadic cubes with disjoint interiors so that
$\Omega = \bigcup_{j=1}^{\infty} Q_j$, and

$$\operatorname{diam}(Q_j) \le d(Q_j, \Omega^c) \le 4 \operatorname{diam}(Q_j). \tag{37}$$

Proof. We claim first that every point $x \in \Omega$ belongs to some dyadic
cube $Q_x$ for which (37) holds (with $Q_x$ in place of $Q_j$).

Let $\delta = d(x, \Omega^c) > 0$. Now the dyadic cubes containing $x$ have diameters
varying over $\{\sqrt{d}\, 2^{-k}\}$, $k \in \mathbb{Z}$. Hence we can find a dyadic cube
$Q_x$ which contains $x$, with $\delta/4 \le \operatorname{diam}(Q_x) \le \delta/2$. Now
$d(Q_x, \Omega^c) \le \delta \le 4 \operatorname{diam}(Q_x)$, since $x \in Q_x$. Also

$$d(Q_x, \Omega^c) \ge \delta - \operatorname{diam}(Q_x) \ge \delta/2 \ge \operatorname{diam}(Q_x),$$

thus (37) is proved for $Q_x$. Now let $\mathcal{Q}$ be the collection of all cubes
$Q_x$ obtained as $x$ ranges over $\Omega$. Their union clearly covers $\Omega$ but their
interiors are far from disjoint. To achieve the desired disjointness select
from $\mathcal{Q}$ the maximal cubes, that is, those cubes in $\mathcal{Q}$ not contained in
larger cubes of $\mathcal{Q}$. Clearly, by what has been said above, each $Q$ is
contained in a maximal cube and these maximal cubes necessarily have
disjoint interiors. The lemma is therefore proved.
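
The selection of the cube $Q_x$ in the proof can be followed on a small numerical example. The choices $d = 1$, $\Omega = (0, 1)$ and $x = 0.3$ below are illustrative assumptions made here, not data from the text.

```latex
% Illustrative data (not from the text): d = 1, \Omega = (0,1), x = 0.3.
\begin{align*}
\delta &= d(x,\Omega^c) = 0.3, \qquad
  \tfrac{\delta}{4} = 0.075 \;\le\; 2^{-k} \;\le\; 0.15 = \tfrac{\delta}{2}
  \quad\Longrightarrow\quad k = 3,\ \ 2^{-3} = 0.125, \\
Q_x &= \big[\tfrac{2}{8},\tfrac{3}{8}\big] = [0.25,\,0.375] \ni x, \qquad
  \operatorname{diam}(Q_x) = 0.125, \qquad
  d(Q_x,\Omega^c) = \min(0.25,\ 0.625) = 0.25, \\
&\operatorname{diam}(Q_x) = 0.125 \;\le\; d(Q_x,\Omega^c) = 0.25 \;\le\; 0.5 = 4\operatorname{diam}(Q_x).
\end{align*}
```

So (37) holds for this $Q_x$; whether it is one of the maximal cubes retained in the final collection depends on the cubes chosen for the other points of $\Omega$.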

With the above lemma, we can redo the decomposition of $f$ in the
setting $d \ge 2$. The argument is essentially the same as before except for
some small changes. For $\alpha \ge 1$, we apply the lemma to the open set
$E_\alpha = \{x : f^\dagger(x) > \alpha\}$; therefore we have a decomposition $f = g + b$, with
$g = f \chi_{E_\alpha^c} + \sum_{j=1}^{\infty} m_j \chi_{Q_j}$, and $b = \sum_j b_j$, with $b_j = (f - m_j)\chi_{Q_j}$. Now as
in the case $d = 1$ we see that $|m_j| \le c\alpha$. In fact, $\int_{Q_j} |f|\,dx \le \int_B |f|\,dx$
for any ball $B \supset Q_j$. We choose $B$ so that it contains a point $\bar{x}$ of $E_\alpha^c$.
We can do this with a ball whose radius is $5 \operatorname{diam}(Q_j)$, since
$d(Q_j, E_\alpha^c) \le 4 \operatorname{diam}(Q_j)$. If we choose such a ball and it has radius $\le 1$ (that is,
$\operatorname{diam}(Q_j) \le 1/5$), then

$$\frac{1}{m(B)} \int_B |f(x)|\,dx \le f^\dagger(\bar{x}) \le \alpha,$$

and hence $|m_j| \le c_1 \alpha$ where $m(B)/m(Q_j) = c_1$. (The ratio $c_1$ is
independent of $j$.) Otherwise, if $\operatorname{diam}(Q_j) \ge 1/5$, the inequality $|m_j| \le c_2 \alpha$
is automatic (with $c_2$ independent of $j$), since $\int |f(x)|\,dx \le 1$ by
assumption, and $\alpha \ge 1$. In either case, therefore $|m_j| \le c\alpha$. Next, since each
dyadic cube arising in the decomposition of $\{x : f^\dagger(x) > 2^{k+1}\}$ must be
a sub-cube of a dyadic cube arising for $\{x : f^\dagger(x) > 2^k\}$ we can proceed
as before to obtain

$$f = g^0 + \sum_{k,j} A_j^k$$

with $A_j^k$ supported in the cube $Q_j^k$ and $\{x : f^\dagger(x) > 2^k\} = \bigcup_j Q_j^k$.
As a result we can write $A_j^k = \lambda_j^k a_j^k$, where $\lambda_j^k = c' 2^k m(Q_j^k)$ and $a_j^k$ are
atoms associated to the balls $B_j^k$, where the ball $B_j^k$ is defined to be,
for each $k$ and $j$, the smallest ball containing the cube $Q_j^k$. Note that
$m(B_j^k)/m(Q_j^k)$ is independent of $k$ and $j$. (See Figure 7.)

Finally, since

$$\sum_{k,j} 2^k m(Q_j^k) = \sum_k 2^k m(\{x : f^\dagger(x) > 2^k\}) < \infty,$$

as above, we have established the atomic decomposition of $f$, concluding
the proof of the proposition.


5.2 An alternative definition of $\mathbf{H}^1_r$

A nearly immediate consequence of Proposition 5.1 allows us to recast
the atomic decomposition of $\mathbf{H}^1_r$ in a more general form. For any $p$ with
$p > 1$ we define a $p$-atom (associated to a ball $B$) to be a measurable
function $a$ which satisfies:

(i') $a$ is supported in $B$, and $\|a\|_{L^p} \le m(B)^{-1 + 1/p}$;

(ii') $\int_{\mathbb{R}^d} a(x)\,dx = 0$.

We reserve the terminology of “atom” for the atoms defined previously
in Section 5.1, which correspond to $p$-atoms for $p = \infty$. Note that any
atom is automatically a $p$-atom.

Corollary 5.3 Fix $p > 1$. Then any $p$-atom $a$ is in $\mathbf{H}^1_r$. Moreover there
is a bound $c_p$, independent of the atom $a$, so that

$$\|a\|_{\mathbf{H}^1_r} \le c_p. \tag{38}$$

Note that the proof below yields that $c_p = O(1/(p - 1))$ as $p \to 1$. Also,
the requirement $p > 1$ for the conclusion of the corollary is necessary, as
can be seen by using the reasoning in Exercise 17.

Proof. One can rescale a $p$-atom $a$, associated to a ball $B$ of radius $r$,
by replacing $a$ by $a_r$, with $a_r(x) = r^d a(rx)$. Then clearly $a_r(x)$
is supported where $rx \in B$, that is, $x \in \frac{1}{r}B = B^r$, and the latter ball has
radius one. Also since $m(B^r) = r^{-d} m(B)$ and $\|a_r\|_{L^p} = r^{d - d/p} \|a\|_{L^p}$, we
have $\|a_r\|_{L^p} \le m(B^r)^{-1 + 1/p}$. Thus $a_r$ is a $p$-atom for the (unit) ball $B^r$.
Moreover, as has already been observed, $\|r^d f(rx)\|_{\mathbf{H}^1_r} = \|f\|_{\mathbf{H}^1_r}$, for every
$r > 0$. Thus (38) has been reduced to the case of $p$-atoms associated to
balls of unit radius. Observe that automatically for such $p$-atoms one
has $\int |a(x)|\,dx \le 1$, therefore we see that we find ourselves exactly in the
setting of the proof of Proposition 5.1 with $f(x) = a(x)$. In fact one notes
that what is proved there amounts to (38), with the constant $c_p$
incorporating the bound $A_p$ in (26) for the maximal function, since the
calculation for $\int f^\dagger(x)\,dx$ used to establish (32) shows that this quantity is
bounded by $cA_p \|f\|_{L^p}$. We have already noted that $A_p = O(1/(p - 1))$
as $p \to 1$. Because $f = a$, the proof of (38) is complete.

As a result, if $f = \sum_{k=1}^{\infty} \lambda_k a_k$ with $p$-atoms $a_k$, and $\sum |\lambda_k| < \infty$, then
$f$ is in $\mathbf{H}^1_r$ and

$$\|f\|_{\mathbf{H}^1_r} \le c_p \sum_{k=1}^{\infty} |\lambda_k|.$$

Conversely, whenever $f \in \mathbf{H}^1_r$, it has a decomposition with respect to
($p = \infty$) atoms and therefore has such a decomposition with respect to
$p$-atoms. We may summarize this as follows.

In defining $\mathbf{H}^1_r$ via (30) and (31), we may replace atoms by
$p$-atoms, $p > 1$, and obtain an equivalent norm.

5.3 Application to the Hilbert transform 

The result below exemplifies the role of the Hardy space $\mathbf{H}^1_r$ as an
improvement over the space $L^1$. In contrast with the failure of the
boundedness of the Hilbert transform on $L^1$, we have that it is bounded from
$\mathbf{H}^1_r$ to $L^1$.

Theorem 5.4 If $f$ belongs to the Hardy space $\mathbf{H}^1_r(\mathbb{R})$, then $H_\epsilon(f) \in L^1(\mathbb{R})$,
for every $\epsilon > 0$. Moreover $H_\epsilon(f)$ (see (14)) converges in the $L^1$
norm, as $\epsilon \to 0$. Its limit, defined as $H(f)$, satisfies

$$\|H(f)\|_{L^1(\mathbb{R})} \le A \|f\|_{\mathbf{H}^1_r(\mathbb{R})}.$$

Proof. The argument below illustrates a nice feature of $\mathbf{H}^1_r(\mathbb{R})$: to
show the boundedness of an operator on $\mathbf{H}^1_r$ it often suffices merely to
verify it for atoms, and this is usually a simple task.

Let us first see that for all atoms $a$, we have

$$\|H_\epsilon(a)\|_{L^1(\mathbb{R})} \le A, \tag{39}$$

with $A$ independent of the atom $a$ and $\epsilon$. Indeed, we can avail ourselves of
the translation-invariance and scale-invariance of the Hilbert transform
to simplify matters even further by restricting ourselves in proving (39)
to the case of atoms associated to the (unit) interval $I = [-1/2, 1/2]$.
This reduction proceeds, on the one hand, by recalling that if
$a_r(x) = ra(rx)$, then $H_\epsilon(a_r)(x) = rH_{r\epsilon}(a)(rx)$; that $a_r$ is an atom associated to the
interval $I^r = \frac{1}{r}I$ whenever $a$ is supported in $I$; and that
$\|rF(rx)\|_{L^1(\mathbb{R})} = \|F(x)\|_{L^1(\mathbb{R})}$, whenever $F \in L^1$. On the other hand, the translations
$f(x) \mapsto f(x + h)$, $h \in \mathbb{R}$, commute with the operator $H$, as is evident
from (14); also translation clearly preserves atoms and the radii of their
associated balls.

Thus in proving (39) we may assume that $a$ is an atom associated to
the interval $|x| \le 1/2$. We will estimate $H_\epsilon(a)(x)$ differently, according
to whether $|x| \le 1$ ($x$ belongs to the “double” of the support of $a$), or
$|x| > 1$. In the first case, we have

$$\int_{|x| \le 1} |H_\epsilon(a)(x)|\,dx \le 2^{1/2} \left( \int_{|x| \le 1} |H_\epsilon(a)(x)|^2\,dx \right)^{1/2} \le 2^{1/2} \|H_\epsilon(a)\|_{L^2} \le c\|a\|_{L^2} \le c,$$

using the Cauchy-Schwarz inequality and the $L^2$ theory studied earlier.
Next when $|x| > 1$ we write (for small $\epsilon$)

$$H_\epsilon(a)(x) = \frac{1}{\pi} \int_{|x - t| \ge \epsilon} a(t)\, \frac{dt}{x - t}
  = \frac{1}{\pi} \int_{|x - t| \ge \epsilon} a(t) \left[ \frac{1}{x - t} - \frac{1}{x} \right] dt,$$

since $\int a(t)\,dt = 0$. Hence if $|x| > 1$, then $|H_\epsilon(a)(x)| \le c/x^2$ because
$\left| \frac{1}{x - t} - \frac{1}{x} \right| \le \frac{1}{x^2}$ when $|x| \ge 1$ and $|t| \le 1/2$, and $|a(t)| \le 1$. Therefore
$\int_{|x| > 1} |H_\epsilon(a)(x)|\,dx \le 2c$, and this proves (39) for atoms associated to
the interval $[-1/2, 1/2]$, and thus for all atoms.

At the same time, the inequality $|H_\epsilon(a)(x)| \le c/x^2$ when $|x| > 1$, and the convergence in the $L^2$ norm, guaranteed by Proposition 3.1, show that $H_\epsilon(a)$ converges in the $L^1$ norm to $H(a)$, as $\epsilon \to 0$, for every atom $a$.
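The decay $|H(a)(x)| \le c/x^2$ away from the support can be checked numerically on a concrete atom. The sketch below is only an illustration under stated assumptions: the atom $a(t) = \operatorname{sign}(t)$ on $[-1/2, 1/2]$ is our own choice, the normalization $H(f)(x) = \frac{1}{\pi}\,\mathrm{p.v.}\!\int f(t)\,\frac{dt}{x-t}$ is the one used in the display above, and the function names are hypothetical. For this atom one finds the closed form $H(a)(x) = \frac{1}{\pi}\log\frac{x^2}{x^2 - 1/4}$ for $|x| > 1$, so that $x^2 H(a)(x) \to 1/(4\pi)$.

```python
import math

def atom(t):
    """Atom on I = [-1/2, 1/2]: |a| <= 1 = 1/|I| and integral zero."""
    if -0.5 <= t < 0.0:
        return -1.0
    if 0.0 < t <= 0.5:
        return 1.0
    return 0.0

def hilbert_far(x, n=20000):
    """Midpoint-rule value of (1/pi) * int a(t) / (x - t) dt for |x| > 1.

    For |x| > 1 the kernel 1/(x - t) is smooth on the support of a, so no
    principal value (and no truncation |x - t| >= eps) is needed.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = -0.5 + (k + 0.5) * h
        total += atom(t) / (x - t)
    return total * h / math.pi

def hilbert_exact(x):
    """Closed form for this particular atom: (1/pi) * log(x^2 / (x^2 - 1/4))."""
    return math.log(x * x / (x * x - 0.25)) / math.pi

for x in (1.5, 3.0, 10.0, 100.0):
    # Last column: x^2 * H(a)(x), which stays bounded as x grows.
    print(x, hilbert_far(x), hilbert_exact(x), x * x * hilbert_exact(x))
```

The boundedness of the last column is exactly the cancelation at work: without $\int a = 0$ one would only get decay of order $1/|x|$, which is not integrable at infinity.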


Now if $f = \sum_{k=1}^\infty \lambda_k a_k$ is an $H^1_r$ function with the indicated atomic decomposition, then by (39)

$$\|H_\epsilon(f)\|_{L^1} \le A\sum_{k=1}^\infty |\lambda_k|,$$

and if we take the infimum over atomic decompositions, we obtain

$$\|H_\epsilon(f)\|_{L^1} \le A\|f\|_{H^1_r}, \quad \text{for every } f \in H^1_r. \tag{40}$$

Next, let $f_N = \sum_{k=1}^N \lambda_k a_k$, so that $f = f_N + (f - f_N)$. Now since $f_N$ is a finite linear combination of atoms, it is itself a constant multiple of an atom. So we know that $H_\epsilon(f_N)$ converges in the $L^1$ norm as $\epsilon \to 0$.

Also,

$$\|H_{\epsilon_1}(f) - H_{\epsilon_2}(f)\|_{L^1} \le \|H_{\epsilon_1}(f_N) - H_{\epsilon_2}(f_N)\|_{L^1} + 2A\|f - f_N\|_{H^1_r}.$$

However, $\|f - f_N\|_{H^1_r} \to 0$, as $N \to \infty$. Thus given $\delta > 0$ and choosing first $N$ sufficiently large, then with both $\epsilon_1$ and $\epsilon_2$ sufficiently small, we get that $\|H_{\epsilon_1}(f) - H_{\epsilon_2}(f)\|_{L^1} < \delta$, which shows that $H_\epsilon(f)$ converges in the $L^1$ norm. The conclusion asserted by the theorem then follows from (40), and the proof is complete.

Remark. A more elaborate form of the argument given above shows that in fact the Hilbert transform maps the Hardy space $H^1_r$ to itself. This is outlined in a more general setting in Problem 2 of the next chapter.

6 The space $H^1_r$ and maximal functions

The real Hardy space $H^1_r$ also leads to interesting insights regarding maximal functions. The fact that this might be the case was already suggested by the use of $f^*$ (more precisely, its truncated version $f^\dagger$) in the proof of Proposition 5.1. In parallel with what we saw for the Hilbert transform, our goal will be to find a suitable maximal function that maps $H^1_r$ to $L^1$. In doing this we must keep in mind the following points.

First, neither $f^*$ nor $f^\dagger$ can be used as such because by their definitions both $f^*$ and $f^\dagger$ involve $f$ only through its absolute value, and therefore cannot take into account the cancelation properties of $f$ that enter to exploit the fact that $f \in H^1_r$.

Second, even if one removed the absolute values in the definitions of 
these maximal functions this would not be enough, because the cut-off 
functions involved (the characteristic functions of balls) are not smooth. 

It is the notion of nice “approximations to the identity,” and the resulting family of convolution operators that lead us to the version of the maximal function relevant for $H^1_r$. Recall that if we fix a suitable function $\Phi$ that, for example, is bounded and has compact support, then for any $f \in L^1$, if $\Phi_\epsilon(x) = \epsilon^{-d}\Phi(x/\epsilon)$, then

$$(f * \Phi_\epsilon)(x) \to f(x), \quad \text{as } \epsilon \to 0, \text{ for a.e. } x,$$

under the assumption $\int \Phi(x)\,dx = 1$.

Given this $\Phi$ we define the maximal function $M$ corresponding to the limit above by

$$M(f)(x) = \sup_{\epsilon > 0} |(f * \Phi_\epsilon)(x)|. \tag{41}$$

Note that by what we have said it is easy to observe that for every $f \in L^1$,

$$|f(x)| \le M(f)(x) \le c f^*(x), \quad \text{for a.e. } x,$$

where $c$ is a suitable constant.

We shall also want to assume that $\Phi$ has some smoothness, as indicated above. With this in mind we can state our result as follows.

Theorem 6.1 Suppose $\Phi$ is a $C^1$ function with compact support on $\mathbb{R}^d$. With $M$ defined by (41) we have that $M(f) \in L^1(\mathbb{R}^d)$ whenever $f \in H^1_r(\mathbb{R}^d)$. Moreover

$$\|M(f)\|_{L^1(\mathbb{R}^d)} \le A\|f\|_{H^1_r(\mathbb{R}^d)}. \tag{42}$$

Before coming to the proof, which is very similar to that of the Hilbert 
transform, we make some additional remarks. 

• In the definition of $M$ we have assumed that the function $\Phi$ that enters has one degree of smoothness. Less could be assumed with the same result (for example a Hölder condition of exponent $\alpha$ with $0 < \alpha < 1$), but some degree of smoothness is necessary. (See Exercise 22.)

• In fact, the inequality (42) can be reversed. Thus there is a converse theorem that gives the maximal characterization of $H^1_r$. This is formulated in Problem 6*.

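Proof. Suppose $f$ is in $H^1_r(\mathbb{R}^d)$ and $f = \sum_k \lambda_k a_k$ is an atomic decomposition. Then clearly $M(f) \le \sum_k |\lambda_k| M(a_k)$, and thus it suffices to prove (42) when $f$ is an atom $a$.

In fact, note that with $a_r$ defined as $a_r(x) = r^d a(rx)$, $r > 0$, we have $(a_r * \Phi_\epsilon)(x) = r^d (a * \Phi_{\epsilon r})(rx)$, and hence $M(a_r)(x) = r^d M(a)(rx)$. Also the mapping $a \mapsto M(a)$ clearly commutes with translations. Therefore in proving (42) we may assume that the atom $a$ is associated to the unit ball (centered at the origin).

Now we consider two cases: when $|x| \le 2$ and when $|x| > 2$. In the first case clearly $M(a)(x) \le c$ and hence $\int_{|x| \le 2} M(a)(x)\,dx \le c'$. In the second case, we write

$$(a * \Phi_\epsilon)(x) = \epsilon^{-d}\int a(y)\,\Phi\!\left(\frac{x-y}{\epsilon}\right) dy = \epsilon^{-d}\int a(y)\left[\Phi\!\left(\frac{x-y}{\epsilon}\right) - \Phi\!\left(\frac{x}{\epsilon}\right)\right] dy,$$

since $\int a(y)\,dy = 0$. However since $|x| > 2$ and $|y| \le 1$, we have that $|x - y| \ge |x|/2$. Moreover since $\Phi \in C^1$ we have that $\left|\Phi\!\left(\frac{x-y}{\epsilon}\right) - \Phi\!\left(\frac{x}{\epsilon}\right)\right| \le c|y|/\epsilon \le c/\epsilon$. In addition the fact that $\Phi$ has compact support implies that $(a * \Phi_\epsilon)(x)$ vanishes unless $\frac{|x-y|}{\epsilon} \le A$ for some bound $A$, which in turn means that $\epsilon \ge |x|/(2A)$. Altogether then

$$|(a * \Phi_\epsilon)(x)| = \epsilon^{-d}\left|\int a(y)\left[\Phi\!\left(\frac{x-y}{\epsilon}\right) - \Phi\!\left(\frac{x}{\epsilon}\right)\right] dy\right| \le c\,\epsilon^{-d-1} \le c'|x|^{-d-1}$$

for those $x$. As a result $\int_{|x| > 2} M(a)(x)\,dx \le c$. Therefore (42) is established and the theorem is proved.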
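The bound $M(a)(x) \le c'|x|^{-d-1}$ for $|x| > 2$ can be observed numerically when $d = 1$. The following is a rough sketch, not part of the proof: the cut-off $\Phi(u) = (1 - u^2)^2$ on $|u| \le 1$, the atom $a(y) = \tfrac12\operatorname{sign}(y)$ on $[-1, 1]$, the finite grid of $\epsilon$ replacing the supremum, and all function names are our own illustrative assumptions.

```python
def phi(u):
    """A C^1 cut-off with support in |u| <= 1 (an illustrative choice)."""
    return (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0

def atom(y):
    """Atom for the unit ball B = [-1, 1] in R: |a| <= 1/m(B) = 1/2, mean zero."""
    if abs(y) > 1.0 or y == 0.0:
        return 0.0
    return 0.5 if y > 0.0 else -0.5

def conv(x, eps, n=400):
    """Midpoint rule for (a * Phi_eps)(x) = eps^{-1} * int a(y) Phi((x - y)/eps) dy."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        y = -1.0 + (k + 0.5) * h
        total += atom(y) * phi((x - y) / eps)
    return total * h / eps

def maximal(x, eps_grid):
    """Approximates M(a)(x) = sup_eps |(a * Phi_eps)(x)| by a maximum over a finite grid."""
    return max(abs(conv(x, eps)) for eps in eps_grid)

eps_grid = [0.25 * j for j in range(1, 401)]   # eps from 0.25 to 100
for x in (3.0, 5.0, 10.0, 20.0):
    m = maximal(x, eps_grid)
    # Last column: x^2 * M(a)(x), which should remain bounded (d + 1 = 2).
    print(x, m, x * x * m)
```

Following the estimates in the proof with these choices (support bound $A = 1$, $\sup|\Phi'| = 8/(3\sqrt{3})$) gives $x^2 M(a)(x) \le 4\sup|\Phi'| < 6.2$ for $|x| \ge 2$; the printed values should sit well below that.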


6.1 The space BMO 

In the same sense that the real Hardy space $H^1_r(\mathbb{R}^d)$ is a substitute for $L^1(\mathbb{R}^d)$, the space $\mathrm{BMO}(\mathbb{R}^d)$ is the corresponding natural substitute for the space $L^\infty(\mathbb{R}^d)$.

A locally integrable function $f$ on $\mathbb{R}^d$ is said to be of bounded mean oscillation (abbreviated by BMO) if

$$\sup_B \frac{1}{m(B)}\int_B |f(x) - f_B|\,dx < \infty, \tag{43}$$

where the supremum is taken over all balls $B$. Here $f_B$ denotes the mean-value of $f$ over $B$, namely

$$f_B = \frac{1}{m(B)}\int_B f(x)\,dx.$$

The quantity (43) is taken as the norm in the space BMO, and is denoted by $\|f\|_{\mathrm{BMO}}$.

We first make some observations about the space of BMO functions.

• The null elements of the norm are the constant functions. Thus, strictly speaking, elements of BMO should be thought of as equivalence classes of functions, modulo constants.

• Note that if (43) holds with possibly different constants $c_B$ instead of $f_B$, then $f$ would still be in BMO. Indeed, if for all $B$

$$\frac{1}{m(B)}\int_B |f(x) - c_B|\,dx \le A,$$

then necessarily $|f_B - c_B| \le A$ and hence $\|f\|_{\mathrm{BMO}} \le 2A$. It is also easy to verify that one would obtain the same space (with an equivalence of norms) if the balls appearing in (43) were replaced by, say, the family of all cubes.

• If $f \in L^\infty$ then it is obvious that $f$ is in BMO. A more typical example of a BMO function is $f(x) = \log|x|$. Like the general BMO function it has the property that it belongs (locally) to every $L^q$ space, with $q < \infty$. It also exemplifies a property shared by BMO and the $L^\infty$ space: whenever $f(x)$ belongs to one of these spaces, then so does the scaled function $f(rx)$, $r > 0$, with the norm remaining unchanged. (For more about the above remarks, see Exercise 23, and Problems 3 and 4.)


• The space of real-valued BMO functions forms a lattice, that is, if $f$ and $g$ belong to BMO then so do $\min(f, g)$ and $\max(f, g)$. This is because $|f|$ is in BMO whenever $f$ is, which in turn follows from the fact that $\big||f| - |f_B|\big| \le |f - f_B|$. However, if $f \in \mathrm{BMO}$ and $|g| \le |f|$, it is not necessarily true that $g$ belongs to BMO.

• From the above, we also deduce that if $f \in \mathrm{BMO}$ is real-valued, and $f^{(k)}$ is the truncation of $f$ defined by $f^{(k)}(x) = f(x)$ if $|f(x)| \le k$; $f^{(k)}(x) = k$ if $f(x) > k$; and $f^{(k)}(x) = -k$ if $f(x) < -k$, then $\{f^{(k)}\}$ is a sequence of bounded BMO functions so that $|f^{(k)}| \le |f|$ for all $k$, $f^{(k)} \to f$ for a.e. $x$ as $k \to \infty$, and hence $\|f^{(k)}\|_{\mathrm{BMO}} \to \|f\|_{\mathrm{BMO}}$ as $k \to \infty$.

If $f$ is complex-valued, one may apply this to both the real and imaginary parts of $f$.
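Two of the observations above lend themselves to a quick numerical check in dimension one: the mean oscillation of $\log|x|$ over the interval $(0, r)$ does not depend on $r$ (a direct computation gives the value $2/e$), and truncation does not increase the mean oscillation over a fixed interval. The sketch below is only an illustration; the helper names and the midpoint-rule discretization are our own assumptions.

```python
import math

def mean_oscillation(f, a, b, n=20000):
    """Midpoint-rule estimate of (1/m(B)) * int_B |f - f_B| dx over B = (a, b)."""
    h = (b - a) / n
    values = [f(a + (k + 0.5) * h) for k in range(n)]
    mean = sum(values) / n                      # approximates f_B
    return sum(abs(v - mean) for v in values) / n

def log_abs(x):
    return math.log(abs(x))

def truncate(f, k):
    """The truncation f^(k): values of f clipped to the range [-k, k]."""
    return lambda x: max(-k, min(k, f(x)))

for r in (1e-3, 1.0, 1e3):
    print(r, mean_oscillation(log_abs, 0.0, r))   # each close to 2/e

# Truncating at height 2 can only lower the oscillation over (0, 1).
print(mean_oscillation(truncate(log_abs, 2.0), 0.0, 1.0))
```

The independence of $r$ reflects the identity $\log|rx| = \log|x| + \log r$: rescaling only changes the function by a constant, which the mean oscillation ignores.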

Our focus now will be on the key fact that BMO is the dual space of the Hardy space $H^1_r$. This assertion means that every continuous linear functional $\ell$ on $H^1_r$ can be realized as

$$\ell(f) = \int_{\mathbb{R}^d} f(x)g(x)\,dx, \quad f \in H^1_r, \tag{44}$$

for some element $g$ in BMO, when (44) is suitably defined. In fact, a little care must be exercised when dealing with the pairing (44): for general $f \in H^1_r$ and $g \in \mathrm{BMO}$, the integral need not converge. See Exercise 24. Thus we proceed indirectly, defining $\ell$ first on a dense subspace of $H^1_r$. This will be $H^1_0$, the subspace of finite linear combinations of atoms. Note that every element of $H^1_0$ is itself a multiple of an atom. Also, if $f \in H^1_0$ the integral converges, and the ambiguity of the BMO element $g$ (that is, the additive constant) disappears because $\int f\,dx = 0$.

Our basic result then states: 

Theorem 6.2 Suppose $g \in \mathrm{BMO}$. Then the linear functional $\ell$ defined by (44), initially considered for $f \in H^1_0$, has a unique extension to $H^1_r$ that satisfies

$$\|\ell\| \le c\|g\|_{\mathrm{BMO}}.$$

Conversely, every bounded linear functional $\ell$ on $H^1_r$ can be written as (44) with $g \in \mathrm{BMO}$ and

$$\|g\|_{\mathrm{BMO}} \le c'\|\ell\|.$$

Here $\|\ell\|$ stands for $\|\ell\|_{(H^1_r)^*}$, the norm of $\ell$ as a linear functional on $H^1_r$.

Proof. Let us first assume that $g \in \mathrm{BMO}$ is bounded. Start with a general $f \in H^1_r$, and let $f = \sum_{k=1}^\infty \lambda_k a_k$ be an atomic decomposition. Then by the convergence of the sum in the $L^1$ norm we get $\ell(f) = \sum_k \lambda_k \int a_k g$. But

$$\int a_k(x)g(x)\,dx = \int a_k(x)[g(x) - g_{B_k}]\,dx,$$

where $a_k$ is supported in the ball $B_k$. However $|a_k(x)| \le \frac{1}{m(B_k)}$ and thus

$$|\ell(f)| \le \sum_k |\lambda_k| \frac{1}{m(B_k)}\int_{B_k} |g(x) - g_{B_k}|\,dx.$$

Therefore considering all possible decompositions of $f$ then gives

$$\left|\int f(x)g(x)\,dx\right| \le \|f\|_{H^1_r}\|g\|_{\mathrm{BMO}},$$

under the assumption that $g$ is bounded. Next, if we restrict ourselves to $f \in H^1_0$ (in particular to $f$ that are bounded) with a general $g$ in BMO, and let $g^{(k)}$ be the truncation of $g$ (as defined above), then the fact that

$$\left|\int f(x)g^{(k)}(x)\,dx\right| \le \|f\|_{H^1_r}\|g^{(k)}\|_{\mathrm{BMO}}$$

just proved, together with a passage to the limit as $k \to \infty$ using the dominated convergence theorem, shows that

$$|\ell(f)| = \left|\int f(x)g(x)\,dx\right| \le c\,\|f\|_{H^1_r}\|g\|_{\mathrm{BMO}},$$

whenever $f \in H^1_0$ and $g \in \mathrm{BMO}$. Thus the direct conclusion of the theorem is established.

To prove the converse we will test the given linear functional $\ell$ on atoms, and here it will be convenient to test $\ell$ on $p$-atoms, with $p = 2$.

For this purpose fix a ball $B$ and consider the $L^2$ space on $B$ with norm

$$\|f\|_{L^2_B} = \left(\int_B |f(x)|^2\,dx\right)^{1/2},$$

and let $L^2_{B,0}$ denote the subspace of those $f \in L^2_B$ for which $\int f(x)\,dx = 0$. Note that the ball $\|f\|_{L^2_{B,0}} \le m(B)^{-1/2}$ of $L^2_{B,0}$ consists of exactly the 2-atoms associated to $B$.

Let us assume our linear functional $\ell$ has been normalized so that its norm is less than or equal to 1. Then restricting ourselves to $f \in L^2_{B,0}$ we see that $|\ell(f)| \le \|f\|_{H^1_r} \le c\,m(B)^{1/2}\|f\|_{L^2_{B,0}}$, the last inequality being a consequence of (38) in Corollary 5.3. Thus by the Riesz representation theorem for $L^2_{B,0}$ (or as a simple consequence of the self-duality of $L^2$ spaces) there is a $g^B \in L^2_{B,0}$, so that $\ell(f) = \int f g^B\,dx$, for $f \in L^2_{B,0}$. We also have that $\|g^B\|_{L^2_{B,0}} \le c\,m(B)^{1/2}$, because $\|\ell\|_{(L^2_{B,0})^*} \le c\,m(B)^{1/2}$, as we have seen above. Hence for each ball $B$ we have a function $g^B$ defined on $B$. What we want is a single function $g$ so that for each $B$, $g$ and $g^B$ differ by a constant on $B$. To construct this $g$ note that if $B_1 \subset B_2$ then $g^{B_1} - g^{B_2}$ is a constant on $B_1$, since both $g^{B_1}$ and $g^{B_2}$ give the same linear functional on $L^2_{B_1,0}$. Now replace each $g^B$ by $\tilde{g}^B = g^B + c_B$, where the constant $c_B$ is so chosen that $\int_{|x| \le 1} \tilde{g}^B\,dx = 0$. As a result $\tilde{g}^{B_1} = \tilde{g}^{B_2}$ on $B_1$ if $B_1 \subset B_2$.

Therefore we can unambiguously define $g$ on $\mathbb{R}^d$ by taking $g(x) = \tilde{g}^B(x)$, for $x \in B$ and $B$ any ball. Now observe that

$$\frac{1}{m(B)}\int_B |g(x) - c_B|\,dx \le m(B)^{-1/2}\|\tilde{g}^B - c_B\|_{L^2_B} \le m(B)^{-1/2}\|g^B\|_{L^2_{B,0}} \le c.$$

Therefore $g \in \mathrm{BMO}$, with $\|g\|_{\mathrm{BMO}} \le 2c$. Since the representation has been established for $f \in L^2_{B,0}$ and all $B$, it holds for the dense subspace $H^1_0$. The proof of the theorem is therefore concluded.

7 Exercises


1. Show that an inequality

$$\|\{a_n\}\|_{L^q} \le A\|f\|_{L^p} \quad \text{for all } f \in L^p,$$

with $a_n = \frac{1}{2\pi}\int_0^{2\pi} f(\theta)e^{-in\theta}\,d\theta$, is possible only if $1/p + 1/q \le 1$.

[Hint: Let $D_N(\theta) = \sum_{|n| \le N} e^{in\theta}$ be the Dirichlet kernel. Then $\|D_N\|_{L^p} \approx N^{1-1/p}$ as $N \to \infty$, if $p > 1$, and $\|D_N\|_{L^1} \approx \log N$.]

2. The following are simple generalizations of the Hausdorff-Young inequalities.

(a) Suppose $\{\varphi_n\}$ is an orthonormal sequence on $L^2(X, \mu)$. Assume also that $|\varphi_n(x)| \le M$ for all $n$. If $a_n = \int f\,\overline{\varphi_n}\,d\mu$, then $\|\{a_n\}\|_{L^q} \le M^{(2/p)-1}\|f\|_{L^p(X)}$, $1 \le p \le 2$, $1/p + 1/q = 1$.

(b) Suppose $f \in L^p$ on the torus $\mathbb{T}^d$ and $a_n = \int_{\mathbb{T}^d} f(x)e^{-2\pi i n \cdot x}\,dx$, $n \in \mathbb{Z}^d$. Then $\|\{a_n\}\|_{L^q(\mathbb{Z}^d)} \le \|f\|_{L^p(\mathbb{T}^d)}$, where $1/q \le 1 - 1/p$.

3. Check that an inequality of the form $\|\hat{f}\|_{L^q(\mathbb{R}^d)} \le A\|f\|_{L^p(\mathbb{R}^d)}$ (holding for all simple functions $f$) is possible only if $1/p + 1/q = 1$.

[Hint: Let $f_r(x) = f(rx)$, $r > 0$. Then $\hat{f_r}(\xi) = \hat{f}(\xi/r)\,r^{-d}$.]

4. Prove that another necessary condition for the inequality in the previous exercise is that $p \le 2$. In fact the estimate

$$\int_{|\xi| \le 1} |\hat{f}(\xi)|\,d\xi \le A\|f\|_{L^p}$$

can hold only if $p \le 2$.

[Hint: Let $f^s(x) = s^{-d/2}e^{-\pi|x|^2/s}$, $s = \sigma + it$, $\sigma > 0$. Then $\widehat{(f^s)}(\xi) = e^{-\pi s|\xi|^2}$. Note that $\|f^s\|_{L^p} \le c\,t^{d(1/p - 1/2)}$ when $\sigma = 1$, and let $t \to \infty$.]

5. Let $\psi$ be the conformal map of the strip $0 < \mathrm{Re}(z) < 1$ to the upper half-plane defined by $\psi(z) = e^{i\pi z}$. Check that $\Phi(z) = e^{-i\psi(z)}$ is continuous on the closure of the strip, $|\Phi(z)| = 1$ on the boundary lines, but $\Phi(z)$ is unbounded in the strip.

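6. Extend the Riesz convexity theorem (in Section 2) to the $L^{p,r}$ spaces discussed in Exercise 18 of Chapter 1. We assume $T$ is a linear transformation from simple functions to locally integrable functions. Suppose

$$\|T(f)\|_{L^{q_0,s_0}} \le M_0\|f\|_{L^{p_0,r_0}}, \quad \text{and} \quad \|T(f)\|_{L^{q_1,s_1}} \le M_1\|f\|_{L^{p_1,r_1}}$$

for all simple $f$. Prove that as a consequence $\|T(f)\|_{L^{q,s}} \le M_\theta\|f\|_{L^{p,r}}$, where $\frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}$, $\frac{1}{r} = \frac{1-\theta}{r_0} + \frac{\theta}{r_1}$, $\frac{1}{q} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}$, $\frac{1}{s} = \frac{1-\theta}{s_0} + \frac{\theta}{s_1}$, and $0 \le \theta \le 1$.

[Hint: Suppose $f$ and $g$ are a pair of simple functions with $\|f\|_{L^{p,r}} \le 1$ and $\|g\|_{L^{q',s'}} \le 1$. Define

$$f_z = |f(x,t)|^{p\alpha(z)}\,\frac{f(x,t)}{|f(x,t)|}\,\|f(\cdot,t)\|_{L^p(dx)}^{r\beta(z) - p\alpha(z)},$$

where $\alpha(z) = \frac{1-z}{p_0} + \frac{z}{p_1}$ and $\beta(z) = \frac{1-z}{r_0} + \frac{z}{r_1}$. Note that when $z = \theta$, then $f_z = f$. Also

$$\|f_{1+it}\|_{L^{p_1,r_1}} \le 1 \quad \text{and} \quad \|f_{0+it}\|_{L^{p_0,r_0}} \le 1.$$

Make an analogous definition for $g_z$ and consider $\int T(f_z)\,g_z\,dx\,dt$.]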

7. Suppose $f$ is a bounded function on $\mathbb{R}$ with compact support. Then $H(f) \in L^1(\mathbb{R})$ if and only if $\int f\,dx = 0$.

[Hint: If $a = \int f\,dx$, then $H(f)(x) = \frac{a}{\pi x} + O(1/x^2)$ as $|x| \to \infty$.]

8. Suppose $T$ is a bounded linear transformation mapping the space of real-valued $L^p$ functions into itself with

$$\|T(f)\|_{L^p} \le M\|f\|_{L^p}.$$

(a) Let $T'$ be the extension of $T$ to complex-valued functions: $T'(f_1 + if_2) = T(f_1) + iT(f_2)$. Then $T'$ has the same bound: $\|T'(f)\|_{L^p} \le M\|f\|_{L^p}$.

(b) More generally, fix any $N$, then

$$\Big\|\Big(\sum_{j=1}^N |T(f_j)|^2\Big)^{1/2}\Big\|_{L^p} \le M\Big\|\Big(\sum_{j=1}^N |f_j|^2\Big)^{1/2}\Big\|_{L^p}.$$

[Hint: For part (b), let $\xi$ denote a unit vector in $\mathbb{R}^N$, and let $F_\xi = \sum_{j=1}^N \xi_j f_j$, $\xi = (\xi_1, \ldots, \xi_N)$. Then $\int |(TF_\xi)(x)|^p \le M^p \int |F_\xi(x)|^p$. Integrate this inequality for $\xi$ on the unit sphere.]

9. Show the identity of the following two classes of harmonic functions $u$ in the upper half-plane $\mathbb{R}^2_+ = \{z = x + iy,\ y > 0\}$.

(a) The harmonic functions $u$ that are continuous in the closure $\overline{\mathbb{R}^2_+}$ and that vanish at infinity (that is, $u(x, y) \to 0$, if $|x| + y \to \infty$).

(b) The functions representable as $u(x, y) = (f * \mathcal{P}_y)(x)$, where $\mathcal{P}_y(x)$ is the Poisson kernel $\frac{1}{\pi}\frac{y}{x^2 + y^2}$, and $f$ is a continuous function on $\mathbb{R}$ that vanishes at infinity (that is, $f(x) \to 0$, as $|x| \to \infty$).

[Hint: To show that (a) implies (b), let $f(x) = u(x, 0)$. Then $\mathcal{D}(x, y) = u(x, y) - (f * \mathcal{P}_y)(x)$ is harmonic in $\mathbb{R}^2_+$, continuous on $\overline{\mathbb{R}^2_+}$, vanishes at infinity, and moreover $\mathcal{D}(x, 0) = 0$. Thus by the maximum principle, $\mathcal{D}(x, y) = 0$.]

10. Suppose $f \in L^p(\mathbb{R})$. Verify that:

(a) $\|f * \mathcal{P}_y\|_{L^p(\mathbb{R})} \le \|f\|_{L^p}$, $1 \le p \le \infty$.

(b) $f * \mathcal{P}_y \to f$, as $y \to 0$, in the $L^p$ norm, when $1 \le p < \infty$.

11. Assume $f \in L^p(\mathbb{R})$, $1 < p < \infty$. Prove that:

(a) $f * \mathcal{Q}_y = H(f) * \mathcal{P}_y$, where $H$, $\mathcal{P}_y$ and $\mathcal{Q}_y$ are respectively the Hilbert transform, the Poisson kernel and the conjugate Poisson kernel.

(b) $f * \mathcal{Q}_y \to H(f)$ in the $L^p$ norm, as $y \to 0$.

(c) $H_\epsilon(f) \to H(f)$ in the $L^p$ norm, as $\epsilon \to 0$.

[Hint: Verify (a) first for $f \in L^2$ by noting that the Fourier transform of both sides equals $\hat{f}(\xi)\,\frac{\operatorname{sign}(\xi)}{i}\,e^{-2\pi|\xi|y}$.]

12. In $\mathbb{R}^d$ suppose $f(x) = |x|^{-d}(\log 1/|x|)^{-1-\delta}$ if $|x| \le 1/2$, $f(x) = 0$ otherwise. Then observe that $f^*(x) \ge c|x|^{-d}(\log 1/|x|)^{-\delta}$, if $|x| \le 1/2$. Hence if $0 < \delta \le 1$, we have $f \in L^1(\mathbb{R}^d)$ but $f^*(x)$ is not integrable over the unit ball.

13. Prove that the basic distribution function inequality (28) for the maximal function can essentially be reversed, that is, there is a constant $A$ so that

$$m(\{x : f^*(x) > \alpha\}) \ge \frac{A}{\alpha}\int_{|f(x)| > \alpha} |f(x)|\,dx.$$

[Hint: Write $E_\alpha = \{x : f^*(x) > \alpha\}$ as $\bigcup_{j=1}^\infty Q_j$, with $Q_j$ closed cubes satisfying (37), with $\Omega = E_\alpha$. For each $Q_j$ let $B_j$ be the smallest ball so that $Q_j \subset B_j$, and $B_j$ intersects $E_\alpha^c$. First $m(B_j) \le c\,m(Q_j)$, then $\frac{1}{m(B_j)}\int_{B_j} |f|\,dx \le \alpha$. Thus $m(Q_j) \ge \frac{c^{-1}}{\alpha}\int_{B_j} |f(x)|\,dx \ge \frac{c^{-1}}{\alpha}\int_{Q_j} |f(x)|\,dx$. Now add in $j$, and use the fact that $\{x : |f(x)| > \alpha\} \subset \{x : f^*(x) > \alpha\}$.]

14. Deduce the following important consequence from (28) and the previous exercise. Suppose $f$ is an integrable function on $\mathbb{R}^d$, and $B_1$, $B_2$ are a pair of balls with $\overline{B_1} \subset B_2$.

(a) $f^*$ is integrable on $B_1$ if $|f|\log(1 + |f|)$ is integrable on $B_2$.

(b) In the converse direction, whenever $f^*$ is integrable on $B_1$ then $|f|\log(1 + |f|)$ is also integrable there.

[Hint: Integrate the inequalities in $\alpha$, for $\alpha \ge 1$.]

15. Consider the weak-type space, consisting of all functions $f$ for which $m(\{x : |f(x)| > \alpha\}) \le \frac{A}{\alpha}$ for some $A$ and all $\alpha > 0$. One might hope to define a norm on this space by taking the “norm” of $f$ to be the least $A$ for which the above inequality holds. Denote this quantity by $\mathcal{N}(f)$.

(a) Show, however, that $\mathcal{N}$ is not a genuine norm; moreover there is no norm $\|\cdot\|$ on this space so that $\|f\|$ is equivalent with $\mathcal{N}(f)$.

(b) Prove also that this space has no non-trivial bounded linear functionals.

[Hint: Consider $\mathbb{R}$. The function $f(x) = 1/|x|$ has $\mathcal{N}(f) = 2$. But if $f_N(x) = \frac{1}{N}[f(x + 1) + f(x + 2) + \cdots + f(x + N)]$, then $\mathcal{N}(f_N) \ge c\log N$.]

16. Prove that the space $H^1_r$ is complete as follows. Let $\{f_n\}$ be a Cauchy sequence in $H^1_r$. Then since $\{f_n\}$ is also Cauchy in $L^1$, there is an $L^1$ function $f$ so that $f = \lim_{n \to \infty} f_n$ in the $L^1$ norm. Now for an appropriate sub-sequence $\{n_k\}$, write $f = f_{n_1} + \sum_{k=1}^\infty (f_{n_{k+1}} - f_{n_k})$.

17. Consider the function $f$ defined by $f(x) = 1/(x(\log x)^2)$ for $0 < x \le 1/2$ and $f(x) = 0$ if $x > 1/2$, and extended to $x < 0$ by $f(x) = -f(-x)$. Then $f$ is integrable on $\mathbb{R}$, with $\int f = 0$, hence $f$ is a multiple of a 1-atom in the terminology of Section 5.2.

Verify that $M(f) \ge c/(|x|\log(1/|x|))$ for $|x| \le 1/2$, hence $M(f) \notin L^1$, thus by Theorem 6.1 we know that $f \notin H^1_r$.

18. Show that there exists a $c > 1$ so that every $f \in L^1(\mathbb{R}^d)$ can be written as $f(x) = \sum_{k=1}^\infty \lambda_k a_k(x)$, with $\sum |\lambda_k| \le c\|f\|_{L^1}$, where the $a_k$ are “faux” atoms: each $a_k$ is supported in a ball $B_k$; $|a_k(x)| \le 1/m(B_k)$ for all $x$; but $a_k$ does not necessarily satisfy the cancelation condition $\int a_k(x)\,dx = 0$.

[Hint: Let $f_n = E_n(f)$, where $E_n$ replaces $f$ by its average over each dyadic cube of the $n$-th generation. Then $\|f_n - f\|_{L^1} \to 0$. Pick $\{n_k\}$ so that $\|f_{n_{k+1}} - f_{n_k}\|_{L^1} < 1/2^k$, and write $f = f_{n_1} + \sum_{k=1}^\infty (f_{n_{k+1}} - f_{n_k})$.]

19. The following illustrates two senses in which $H^1_r$ is near $L^1$, but yet different.

(a) Suppose $f_0(x)$ is a positive decreasing function on $(0, \infty)$ that is integrable on $(0, \infty)$. Then show that there is a function $f \in H^1_r(\mathbb{R})$ so that $|f(x)| \ge f_0(|x|)$.

(b) However if $f \in H^1_r(\mathbb{R}^d)$, and $f$ is positive on an open set, then its size must be “smaller” on that open set than a general integrable function. In fact, prove that if $f \in H^1_r$, and $f \ge 0$ in a ball $B_1$, then $f\log(1 + f)$ must be integrable over any proper sub-ball $B_0 \subset B_1$.

[Hint: For (a) take $f(x) = \operatorname{sign}(x) f_0(|x|)$, and find an atomic decomposition for $f$. For (b) use Exercise 14, together with the maximal theorem in Section 6, with $\Phi$ positive.]

20. When $f \in L^1(\mathbb{R}^d)$ we know that its Fourier transform $\hat{f}$ is bounded and $\hat{f}(\xi)$ tends to 0 as $|\xi| \to \infty$ (the Riemann-Lebesgue lemma), but no better assertion about the “smallness” of $\hat{f}$ can be made. (For the analogous result for Fourier series, see Chapter 3 in Book III.) Show, however, that for $f \in H^1_r$ we have

$$\int_{\mathbb{R}^d} |\hat{f}(\xi)|\,\frac{d\xi}{|\xi|^d} \le A\|f\|_{H^1_r}.$$

[Hint: Verify this for atoms.]

21. Prove that if $|f(x)| \le A(1 + |x|)^{-d-1}$, and $\int_{\mathbb{R}^d} f(x)\,dx = 0$, then $f \in H^1_r(\mathbb{R}^d)$.

[Hint: While this is elementary, it is a little tricky. Write $f = \sum_{k=0}^\infty f_k$, where $f_0(x) = f(x)$ if $|x| \le 1$, 0 elsewhere, and $f_k(x) = f(x)$ if $2^{k-1} < |x| \le 2^k$ and 0 elsewhere, when $k \ge 1$. Let $c_k = \int f_k\,dx$, $s_k = \sum_{j \ge k} c_j$, then $s_0 = 0$. Fix a bounded function $\eta$ supported in $|x| \le 1$, with $\int \eta(x)\,dx = 1$. Now write $f(x) = \sum_{k=0}^\infty (f_k - c_k\eta_k) + \sum_{k=0}^\infty c_k\eta_k$, where $\eta_k(x) = 2^{-kd}\eta(2^{-k}x)$ and $\int \eta_k = 1$. The first sum is clearly a sum of multiples of atoms (which are $O(2^{-k})$) supported on the balls $|x| \le 2^k$. That the second sum is similar can be seen by rewriting it as $\sum_{k=1}^\infty s_k(\eta_k - \eta_{k-1})$.]

22. Let $f$ be the atom on $\mathbb{R}$ supported in $|x| \le 1/2$ given by $f(x) = \operatorname{sign}(x)$. Apply to $f$ the maximal function $f_0^*$ defined by

$$f_0^*(x) = \sup_{\epsilon > 0} |(f * \chi_\epsilon)(x)|,$$

where $\chi$ is the characteristic function of $|x| \le 1/2$ and $\chi_\epsilon(x) = \epsilon^{-1}\chi(x/\epsilon)$.

Verify that $|f_0^*(x)| \ge 1/(4|x|)$ if $|x| \ge 1/2$, hence $f_0^* \notin L^1$. Thus the maximal function $f_0^*$, defined in terms of $\chi$, cannot be used to characterize the real Hardy space $H^1_r$.
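For readers who wish to experiment with Exercise 22 before proving it, here is a small numerical sketch. It rests on our own observations, not on anything stated in the exercise: $(f * \chi_\epsilon)(x)$ is the average of $f$ over $[x - \epsilon/2, x + \epsilon/2]$, the function $G(y) = |\mathrm{clamp}(y, -1/2, 1/2)|$ is an antiderivative of $f$, and with these normalizations the supremum appears to equal $1/(4|x|)$ for $|x| \ge 1/2$, attained at $\epsilon = 2|x|$. The function names and the finite grid of $\epsilon$ are hypothetical.

```python
def antiderivative(y):
    """G(y) = |clamp(y, -1/2, 1/2)|, an antiderivative of f(y) = sign(y) on
    |y| <= 1/2 (with f = 0 elsewhere), up to an additive constant."""
    return abs(max(-0.5, min(0.5, y)))

def average(x, eps):
    """(f * chi_eps)(x) = (1/eps) * integral of f over [x - eps/2, x + eps/2]."""
    return (antiderivative(x + eps / 2) - antiderivative(x - eps / 2)) / eps

def f0_star(x, eps_grid):
    """Maximum of |(f * chi_eps)(x)| over a finite grid standing in for sup over eps."""
    return max(abs(average(x, eps)) for eps in eps_grid)

eps_grid = [j / 1000 for j in range(1, 40001)]   # eps from 0.001 to 40
for x in (0.5, 1.0, 2.0, 5.0):
    print(x, f0_star(x, eps_grid), 1 / (4 * x))
```

The slow $1/|x|$ decay, in contrast with the $|x|^{-2}$ decay obtained for a $C^1$ cut-off, is what makes $f_0^*$ fail to be integrable.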

23. Verify the following examples related to BMO:

(a) $\log|x| \in \mathrm{BMO}(\mathbb{R}^d)$.

(b) If $f(x) = \log x$, when $x > 0$, and $= 0$ when $x \le 0$, then $f \notin \mathrm{BMO}(\mathbb{R})$.

(c) If $\delta \ge 0$, $|\log|x||^\delta \in \mathrm{BMO}(\mathbb{R}^d)$ if and only if $\delta \le 1$.

[Hint: With $f(x) = \log|x|$, note that $f(rx) = f(x) + c_r$ and so we may assume the ball $B$ has radius 1 in testing the condition (43). For (b), test $f$ on small intervals centered at the origin.]

24. Using Exercises 19 (a) and 23, give examples of $f \in H^1_r$ and $g \in \mathrm{BMO}$ so that $|f(x)g(x)|$ is not integrable over $\mathbb{R}^d$.


8 Problems 

1. Another way $H^1_r$ is an improvement over $L^1$ is in its weak compactness of the unit ball. The following can be proved. Suppose $\{f_n\}$ is a sequence in $H^1_r$ with $\|f_n\|_{H^1_r} \le A$. Then we can select a subsequence $\{f_{n_k}\}$ and find an $f \in H^1_r$ so that $\int f_{n_k}(x)\varphi(x)\,dx \to \int f(x)\varphi(x)\,dx$, as $k \to \infty$, for every $\varphi$ that is a continuous function of compact support.

This is to be compared with $L^1$, and the failure there of weak compactness as described in Exercises 12 and 13 in the previous chapter.

[Hint: Apply the result in Problem 4 (c) of the previous chapter to obtain a subsequence $\{f_{n_k}\}$ and a finite measure $\mu$ so that $f_{n_k} \to \mu$ in the weak* sense. Next use the fact that if $\sup_{\epsilon > 0} |\mu * \Phi_\epsilon| \in L^1$, for an appropriate $\Phi$, then $\mu$ is absolutely continuous.]

2. Suppose $H^p$ is the complex Hardy space defined in Section 5. For $1 < p < \infty$, prove the following:

(a) If $F \in H^p$, then $\lim_{y \to 0} F(x + iy) = F_0(x)$ exists in the $L^p(\mathbb{R})$ norm.

(b) $\|F\|_{H^p} = \|F_0\|_{L^p}$.

(c) One has $2F_0 = f + iH(f)$, with $f$ real-valued in $L^p(\mathbb{R})$, and $\|F_0\|_{L^p} \approx \|f\|_{L^p}$.

Moreover, every $F_0$ (and thus $F$) arises this way. This gives a linear isomorphism (over the reals) of $H^p$ with $L^p$, with an equivalence of norms.

[Hint: Here is an outline of the proof. For each $y_1 > 0$, write $F_{y_1}(z) = F(z + iy_1)$ and $F^\epsilon_{y_1}(z) = F_{y_1}(z)/(1 - i\epsilon z)$, $\epsilon > 0$. One has that $F_{y_1}$ is bounded (see Section 2, in Chapter 5, Book III). Thus by Exercise 9, $F^\epsilon_{y_1}(z) = (F^\epsilon_{y_1} * \mathcal{P}_y)(x)$. Now using the weak compactness of the unit ball in $L^p$ (Exercise 12 in Chapter 1), we can find $F_0 \in L^p$ so that $F^\epsilon_{y_1}(x) \to F_0(x)$ weakly as $\epsilon$ and $y_1 \to 0$. Observe that this breaks down for $p = 1$. Conclusion (c) is then essentially a restatement of the boundedness of the Hilbert transform for $1 < p < \infty$.]

3. Let $P$ be any non-zero polynomial of degree $k$ in $\mathbb{R}^d$. Then $f = \log|P(x)|$ is in BMO and $\|f\|_{\mathrm{BMO}} \le c_k$, where $c_k$ depends only on the degree $k$ of the polynomial.

[Hint: Verify the result first when $d = 1$. Then use induction in the dimension and the following assertion, stated for $\mathbb{R}^2$. Suppose $f(x, y)$, $(x, y) \in \mathbb{R}^2$ is for each $y$ a $\mathrm{BMO}(\mathbb{R})$ function in $x$, uniformly in $y$. Assume also that this holds when the roles of $x$ and $y$ are interchanged. Then $f \in \mathrm{BMO}(\mathbb{R}^2)$.]

4. Prove the following John-Nirenberg inequalities for every $f \in \mathrm{BMO}(\mathbb{R}^d)$:

(a) For every $q < \infty$ there is a bound $b_q$ so that

$$\sup_B \frac{1}{m(B)}\int_B |f - f_B|^q\,dx \le b_q^q\,\|f\|^q_{\mathrm{BMO}}.$$

(b) There are positive constants $\mu$ and $A$, so that

$$\sup_B \frac{1}{m(B)}\int_B e^{\mu|f - f_B|}\,dx \le A, \quad \text{whenever } \|f\|_{\mathrm{BMO}} \le 1.$$

[Hint: For (a) test $f$ against $p$-atoms, where $p$ is dual to $q$. For (b) use the bound $c_p = O(1/(p - 1))$ as $p \to 1$ (in (38)) to obtain $b_q = O(q)$, as $q \to \infty$. Then write $e^u = \sum_{q=0}^\infty u^q/q!$.]

5. The Hilbert transform of a bounded function is in BMO. Show this in two different ways.

(a) Directly: Suppose $f$ is bounded (and belongs to some $L^p$, $1 \le p < \infty$). Then $H(f) \in \mathrm{BMO}$ with

$$\|H(f)\|_{\mathrm{BMO}} \le A\|f\|_{L^\infty},$$

with $A$ not depending on the $L^p$ norm of $f$.

(b) By duality, using Theorem 5.4.

[Hint: For (a), fix any ball $B$, and let $B_1$ be its double. Consider separately $f\chi_{B_1}$ and $f\chi_{B_1^c}$.]

6.* The following is the maximal characterization of $H^1_r(\mathbb{R}^d)$. Suppose $\Phi$ belongs to the Schwartz space $\mathcal{S}$ and $\int \Phi(x)\,dx \ne 0$. Let $M(f)(x) = \sup_{\epsilon > 0} |(f * \Phi_\epsilon)(x)|$, for $f \in L^1$. Then

(a) $f$ is in $H^1_r$ if and only if $M(f)$ belongs to $L^1$.

(b) The condition $\Phi \in \mathcal{S}$ can be relaxed to require only

(c) Note two interesting examples, first $\Phi_{t^{1/2}}(x) = (4\pi t)^{-d/2}e^{-|x|^2/(4t)}$: then $u(x, t) = (f * \Phi_{t^{1/2}})(x)$ is the solution of the heat equation $\Delta_x u = \partial_t u$, with initial data $u(x, 0) = f(x)$. Also, $\Phi_t(x) = \frac{c_d\,t}{(t^2 + |x|^2)^{\frac{d+1}{2}}}$ with $c_d = \Gamma\!\left(\frac{d+1}{2}\right)/\pi^{\frac{d+1}{2}}$, so that $u(x, t) = (f * \Phi_t)(x)$ is the solution to Laplace's equation $\Delta_x u + \partial_t^2 u = 0$, with initial data $u(x, 0) = f(x)$. (Here $\Gamma$ denotes the gamma function.)

7.* $H^p$ when $p = 1$. The results in (a) and (b) of Problem 2 also hold for $p = 1$, but require a different proof. The analog of (c) is as follows. One has $2F_0 = f + iH(f)$ where $f$ belongs to the real Hardy space $H^1_r$. Also $\|F_0\|_{L^1} \approx \|f\|_{H^1_r}$.

As a consequence, a necessary and sufficient condition that $f \in H^1_r$ is that both $f$ and $H(f)$ are in $L^1$.

The conclusions (a) and (b) may be proved by showing that any $F \in H^1$ can be written as $F = F_1 \cdot F_2$ with $F_j \in H^2$ and $\|F_j\|^2_{H^2} = \|F\|_{H^1}$, and then using the corresponding results in $H^2$.

8.* Suppose $f \in L^1(\mathbb{R})$. Then we can define $H(f) \in L^1(\mathbb{R})$ in the weak sense to mean that there exists $g \in L^1(\mathbb{R})$ so that

$$\int_{\mathbb{R}} g\varphi\,dx = -\int_{\mathbb{R}} f H(\varphi)\,dx, \quad \text{for all functions } \varphi \text{ in the Schwartz space.}$$

Then we say $g = H(f)$ in the weak sense.

As a consequence of Problem 7*, one has that $f \in H^1_r(\mathbb{R})$ if and only if $f \in L^1(\mathbb{R})$ and $H(f)$, taken in the weak sense, also belongs to $L^1(\mathbb{R})$.


9.* Let $\{f_n\}$ be a sequence of elements in $H^1_r$ so that $\|f_n\|_{H^1_r} \le M < \infty$ for all $n$. Assume that $f_n$ converges to $f$ almost everywhere. Then:

(a) $f \in H^1_r$.

(b) $\int f_n g \to \int f g$, as $n \to \infty$, for all $g$ continuous with compact support.

A corresponding result holds for $L^p$, $p > 1$, but fails for $p = 1$. See Exercise 14 in Chapter 1.


10.* The following result illustrates the application of $H^1_r$ to the theory of compensated compactness.

Suppose $A = (A_1, \ldots, A_d)$ and $B = (B_1, \ldots, B_d)$ are vector fields in $\mathbb{R}^d$ with $A_i, B_i \in L^2(\mathbb{R}^d)$ for all $i$. The divergence of $A$ is defined by

$$\operatorname{div}(A) = \sum_{k=1}^d \frac{\partial A_k}{\partial x_k},$$

and the curl of $B$ is the $d \times d$ matrix whose $ij$-entry is

$$(\operatorname{curl}(B))_{ij} = \frac{\partial B_i}{\partial x_j} - \frac{\partial B_j}{\partial x_i}.$$

(The derivatives here are taken in the sense of distributions, as in the next chapter.) If $\operatorname{div}(A) = 0$ and $\operatorname{curl}(B) = 0$ then $\sum_{k=1}^d A_k B_k \in H^1_r$.

This is in contrast with the result that in general, if $f, g \in L^2$, then one only has $fg \in L^1$.


Distributions: Generalized Functions


The heart of analysis is the concept of function, and
functions “belong” to analysis, even if, nowadays, they
occur everywhere and anywhere, in and out of mathematics,
in thought, cognition, even perception.

Functions came into being in “modern” mathematics,
that is, in mathematics since the Renaissance.
By a rough division into centuries, the 17th and 18th
centuries made preparations, the 19th century created
functions of one variable, real and complex, and the
20th century has turned to functions in several variables,
real and complex.

S. Bochner, 1969 


... It was not accidental that the notion of function
generally accepted now was first formulated in the celebrated
memoir of Dirichlet (1837) dealing with the
convergence of Fourier series; or that the definition
of Riemann’s integral in its general form appeared in
Riemann’s Habilitationsschrift devoted to trigonometric
series; or that the theory of sets, one of the most
important developments of nineteenth-century mathematics,
was created by Cantor in his attempts to solve
the problem of the sets of uniqueness for trigonometric
series. In more recent times, the integral of Lebesgue
was developed in close connection with the theory of
Fourier series, and the theory of generalized functions
(distributions) with that of Fourier integrals.

A. Zygmund, 1959 


The growth of analysis can be traced by the evolution of the idea 
of what a function is. The formulation of the notion of “generalized 
functions” (or “distributions” as they are commonly called) represents a 
significant stage in that development with ramifications in many different 
areas. Looking back, one can see that this concept had many antecedents.
Among these were: Riemann’s formal integration and differentiation of
trigonometric series in his study of uniqueness; the necessity of using 
weak solutions in the theory of partial differential equations; and the 
possibility of realizing a function (say in L p ) as a linear functional on an 
appropriate dual space. The importance of distributions derives from the 
ease with which this tool permits us to carry out formal manipulations, 
finessing numerous technical issues. While as such it is not a panacea, it 
allows us, in many instances, to arrive more quickly at the heart of the 
matter. 


We divide our treatment of distributions in two parts. First, we set 
down the basic properties of general distributions and the rules of their 
manipulation. Thus we see that an ordinary function has derivatives of 
all orders in the sense of distributions. Also in that sense, any function 
that does not increase too fast at infinity has a Fourier transform. 

Next, we study specific distributions of particular importance, beginning
with the principal-value distribution defining the Hilbert transform,
and more general homogeneous distributions. We also consider distributions
that arise as fundamental solutions of partial differential equations.
Finally, we take up the Calderón-Zygmund distributions that occur as
kernels of singular integrals generalizing the Hilbert transform, and for
these we obtain basic $L^p$ estimates.


1 Elementary properties 

Classically a function $f$ (defined on $\mathbb{R}^d$) assigns a definite value $f(x)$ for each $x \in \mathbb{R}^d$. For many purposes, it is often convenient to relax this requirement by allowing $f$ to remain undefined at certain “exceptional” points $x$. This is particularly so when dealing with integration and measure theory. Thus in that context a function can be unspecified on a set of measure zero.¹

In contrast to this, a distribution or generalized function $F$ will
not be given by assigning values of $F$ at “most” points, but will instead
be determined by its averages taken with respect to (smooth) functions.
Thus if we are to think of a function $f$ as a distribution $F$, we determine
$F$ by the quantities

$$F(\varphi) = \int_{\mathbb{R}^d} f(x)\varphi(x)\,dx, \tag{1}$$

where the $\varphi$’s range over an appropriate space of “test” functions.
Therefore, in keeping with (1), our starting point in defining a distribution $F$
will be to think of $F$ as a linear functional on a suitable space of these
test functions.

1 More precisely, a function is then really an equivalence class of functions that agree
almost everywhere.

Actually, there will be two classes of distributions (each with its space
of test functions) that we will consider: the broader class, which we deal
with first, and which can be defined on any open set $\Omega$ of $\mathbb{R}^d$; also later,
a narrower class of distributions defined on $\mathbb{R}^d$, those which are suitably
“tempered” at infinity, and that arise naturally in the context of Fourier
transforms.

1.1 Definitions 

We fix an open set $\Omega$ in $\mathbb{R}^d$. The test functions for the larger class of
distributions will be the functions that belong to $C_0^\infty(\Omega)$, the
complex-valued indefinitely differentiable functions of compact support in $\Omega$. In
keeping with a common notation used in this context we denote this
space of test functions as $\mathcal{D}$ (or more explicitly as $\mathcal{D}(\Omega)$). Now if
$\{\varphi_n\}$ is a sequence of elements in $\mathcal{D}$, and also $\varphi \in \mathcal{D}$, we say that
$\{\varphi_n\}$ converges to $\varphi$ in $\mathcal{D}$, and write $\varphi_n \to \varphi$ in $\mathcal{D}$, if the
supports of the $\varphi_n$ are contained in a common compact set and for each
multi-index $\alpha$, one has $\partial_x^\alpha \varphi_n \to \partial_x^\alpha \varphi$
uniformly in $x$ as $n \to \infty$.² With this in mind we come to our basic
definition. A distribution $F$ on $\Omega$ is a complex-valued linear functional
$\varphi \mapsto F(\varphi)$, defined for $\varphi \in \mathcal{D}(\Omega)$, that is continuous in the
sense that $F(\varphi_n) \to F(\varphi)$ whenever $\varphi_n \to \varphi$ in $\mathcal{D}$. The vector space of
distributions on $\Omega$ is denoted by $\mathcal{D}^*(\Omega)$.

In what follows we shall tend to reserve the upper case letters $F$,
$G$, … for distributions, and the lower case letters $f$, $g$, … for ordinary
functions. First, we look at a few quick examples of distributions.

Example 1. Ordinary functions. Let $f$ be any locally integrable function
on $\Omega$.³ Then $f$ defines a distribution $F = F_f$, according to the
formula (1). Distributions arising this way are of course referred to as
“functions.”

Example 2. Let $\mu$ be a (signed) Borel measure on $\Omega$ which is finite on
compact subsets of $\Omega$ (sometimes called a Radon measure). Then

$$F(\varphi) = \int_\Omega \varphi(x)\,d\mu(x)$$

is a distribution which, in general, is not a function as above. The special
case, when $\mu$ is the point-mass which assigns total mass of 1 to the origin,
gives the Dirac delta function $\delta$, that is, $\delta(\varphi) = \varphi(0)$. (Note, however,
that $\delta$ is not a function!)

2 We recall the notation: $\partial_x^\alpha = (\partial/\partial x)^\alpha = (\partial/\partial x_1)^{\alpha_1} \cdots (\partial/\partial x_d)^{\alpha_d}$,
$|\alpha| = \alpha_1 + \cdots + \alpha_d$, and $\alpha! = \alpha_1! \cdots \alpha_d!$, where $\alpha = (\alpha_1, \ldots, \alpha_d)$.

3 By this we mean that $f$ is measurable and Lebesgue integrable over any compact
subset of $\Omega$. (Compare this definition with the one in Chapter 3 of Book III, where it has
a slightly different meaning.)

Further examples arise from the above by differentiation. In fact, a 
key feature of distributions is that, as opposed to ordinary functions, 
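these can be differentiated any number of times. The derivative $\partial_x^\alpha F$
of a distribution generalizes that of a differentiable function. Indeed,
whenever $f$ is a smooth function on $\Omega$ and (say) $\varphi \in \mathcal{D}(\Omega)$, then an
integration by parts yields

$$\int_\Omega (\partial_x^\alpha f)\,\varphi\,dx = (-1)^{|\alpha|} \int_\Omega f\,(\partial_x^\alpha \varphi)\,dx.$$

Hence in keeping with (1) we define $\partial_x^\alpha F$ as the distribution given by

$$(\partial_x^\alpha F)(\varphi) = (-1)^{|\alpha|} F(\partial_x^\alpha \varphi), \quad \text{whenever } \varphi \in \mathcal{D}(\Omega).$$

Thus in particular, if $f$ is a locally integrable function, we can define its
partial derivatives as distributions. A few examples may be useful here.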

• Suppose $h$ is the Heaviside function on $\mathbb{R}$, that is, $h(x) = 1$ for $x > 0$,
and $h(x) = 0$ for $x < 0$. Then $dh/dx$, taken in the sense of distributions,
equals the Dirac delta $\delta$. This is because $-\int_0^\infty \varphi'(x)\,dx = \varphi(0)$,
whenever $\varphi \in \mathcal{D}(\mathbb{R})$. Note however that the usual derivative
of $h$ is zero when $x \neq 0$, and is undefined at $x = 0$. So we must be
careful to distinguish the distribution derivative of a function from
its usual derivative (when it exists), if the function is not smooth.
(See also Exercises 1 and 2.)

A higher dimensional variant of the Heaviside function is given in
Exercise 15.

• Suppose the function $f$ is of class $C^k$ on $\Omega$, that is, all the partial
derivatives $\partial_x^\alpha f$ with $|\alpha| \le k$, taken in the usual sense, are continuous
on $\Omega$. Then these derivatives of $f$ agree with the corresponding
derivatives taken in the sense of distributions.

• More generally, suppose $f$ and $g$ are a pair of functions in $L^2(\Omega)$
and $\partial_x^\alpha f = g$ in the “weak sense” as discussed in Section 3.1 of
Chapter 1, or in Section 3.1, Chapter 5 of Book III. If $F$ and $G$
are the distributions determined by $f$ and $g$ respectively, according
to (1), then $\partial_x^\alpha F = G$.
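
As one more worked illustration (a sketch added here, not taken from the text), consider $f(x) = |x|$ on $\mathbb{R}$. The computation below uses only the defining formula $(dF/dx)(\varphi) = -F(\varphi')$ for $\varphi \in \mathcal{D}(\mathbb{R})$ and an integration by parts on each half-line; the boundary terms vanish because $x\varphi(x)$ is zero at $x = 0$ and $\varphi$ has compact support.

```latex
\begin{align*}
\Big(\frac{d}{dx}|x|\Big)(\varphi)
  &= -\int_{-\infty}^{\infty} |x|\,\varphi'(x)\,dx
   = -\int_0^\infty x\,\varphi'(x)\,dx + \int_{-\infty}^0 x\,\varphi'(x)\,dx \\
  &= \int_0^\infty \varphi(x)\,dx - \int_{-\infty}^0 \varphi(x)\,dx
   = \int_{-\infty}^{\infty} \operatorname{sign}(x)\,\varphi(x)\,dx, \\[4pt]
\Big(\frac{d^2}{dx^2}|x|\Big)(\varphi)
  &= -\int_{-\infty}^{\infty} \operatorname{sign}(x)\,\varphi'(x)\,dx
   = -\int_0^\infty \varphi'(x)\,dx + \int_{-\infty}^0 \varphi'(x)\,dx
   = 2\varphi(0).
\end{align*}
```

So, in the sense of distributions, the first derivative of $|x|$ is the function $\operatorname{sign}(x)$ and the second derivative is $2\delta$, in agreement with the Heaviside example since $\operatorname{sign}(x) = 2h(x) - 1$ for $x \neq 0$.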





1.2 Operations on distributions 

As in the case of differentiation, one can carry over various operations on 
distributions by transforming the corresponding actions on test functions. 
We first give some simple examples. 

• Whenever $F$ belongs to $\mathcal{D}^*$ and $\psi$ is a $C^\infty$ function, then we can
define the product $\psi \cdot F$ by $(\psi \cdot F)(\varphi) = F(\psi\varphi)$, for every $\varphi \in \mathcal{D}$.
This agrees with the usual pointwise definition of the product when
$F$ is a function.

• For a distribution on $\mathbb{R}^d$ the actions of translations, dilations and
more generally non-singular linear transformations can be defined
by the corresponding actions on test functions via “duality.” Thus
for the translation operator $\tau_h$ defined for functions by $\tau_h(f)(x) =
f(x - h)$, $h \in \mathbb{R}^d$, the corresponding definition on distributions is:

$$\tau_h(F)(\varphi) = F(\tau_{-h}(\varphi)), \quad \text{for every test function } \varphi.$$

Similarly, for dilations given on functions $f$ by the simple relation
$f_a(x) = f(ax)$, $a > 0$, one defines $F_a$ by $F_a(\varphi) = a^{-d} F(\varphi_{a^{-1}})$.
More generally, if $L$ is a non-singular linear transformation then
the extension of $f_L(x) = f(L(x))$ to distributions is given by the
rule $F_L(\varphi) = |\det L|^{-1} F(\varphi_{L^{-1}})$ for every $\varphi \in \mathcal{D}$.
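
To see these rules in action, here is a short check on the Dirac delta $\delta$ (an illustrative sketch, with $\psi$ an arbitrary $C^\infty$ function, $h \in \mathbb{R}^d$ and $a > 0$). Each line only unwinds the corresponding definition above together with $\delta(\varphi) = \varphi(0)$.

```latex
\begin{align*}
(\psi\cdot\delta)(\varphi) &= \delta(\psi\varphi) = \psi(0)\,\varphi(0),
  &&\text{so } \psi\cdot\delta = \psi(0)\,\delta, \\
\tau_h(\delta)(\varphi) &= \delta(\tau_{-h}\varphi) = (\tau_{-h}\varphi)(0) = \varphi(h),
  &&\text{so } \tau_h(\delta) \text{ is the unit point mass at } h, \\
\delta_a(\varphi) &= a^{-d}\,\delta(\varphi_{a^{-1}}) = a^{-d}\,\varphi(0),
  &&\text{so } \delta_a = a^{-d}\,\delta .
\end{align*}
```

The last line is consistent with the heuristic picture of $\delta$ as a "function" of total integral 1: replacing $x$ by $ax$ divides the total mass by $a^d$.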

It is important that one can also extend the notion of convolution,
defined for appropriate functions on $\mathbb{R}^d$ by

$$(f * g)(x) = \int_{\mathbb{R}^d} f(x - y)g(y)\,dy$$

to large classes of distributions.

To begin with, suppose that $F$ is a distribution on $\mathbb{R}^d$ and $\psi$ a test
function. Then there are two ways that we might define $F * \psi$ (in keeping
with (1) when $F$ is a function). The first is as a function (of $x$) given by

$$F(\psi_x^\sim), \quad \text{with } \psi_x^\sim(y) = \psi(x - y).$$

The second is that $F * \psi$ is the distribution determined by

$$(F * \psi)(\varphi) = F(\psi^\sim * \varphi), \quad \text{with } \psi^\sim = \psi_0^\sim.$$

Proposition 1.1 Suppose $F$ is a distribution and $\psi \in \mathcal{D}$. Then

(a) The two definitions of $F * \psi$ given above coincide.

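(b) The distribution $F * \psi$ is a $C^\infty$ function.

Proof. Let us observe first that $F(\psi_x^\sim)$ is continuous in $x$ and in
fact indefinitely differentiable. Note that if $x_n \to x_0$ as $n \to \infty$, then
$\psi_{x_n}^\sim(y) = \psi(x_n - y) \to \psi(x_0 - y) = \psi_{x_0}^\sim(y)$ uniformly in $y$, and the same
is true for all partial derivatives. Therefore $\psi_{x_n}^\sim \to \psi_{x_0}^\sim$ in $\mathcal{D}$ (as
functions of $y$) as $n \to \infty$, and thus by the assumed continuity of $F$ on $\mathcal{D}$
we have that $F(\psi_x^\sim)$ is continuous in $x$. Similarly, all corresponding
difference quotients converge and the result is that $F(\psi_x^\sim)$ is indefinitely
differentiable, with $\partial_x^\alpha F(\psi_x^\sim) = F(\partial_x^\alpha \psi_x^\sim)$.

It remains to prove conclusion (a), and for this it suffices to show that

$$\int_{\mathbb{R}^d} F(\psi_x^\sim)\varphi(x)\,dx = F(\psi^\sim * \varphi), \quad \text{for each } \varphi \in \mathcal{D}. \tag{2}$$

However since $\psi \in \mathcal{D}$, and of course $\varphi$ is continuous with compact
support, then it is easily seen that

$$(\psi^\sim * \varphi)(x) = \int_{\mathbb{R}^d} \psi^\sim(x - y)\varphi(y)\,dy = \lim_{\epsilon \to 0} S_\epsilon,$$

where $S_\epsilon = \epsilon^d \sum_{n \in \mathbb{Z}^d} \psi^\sim(x - n\epsilon)\varphi(n\epsilon)$. Here the convergence of the
Riemann sums $S_\epsilon$ to $\psi^\sim * \varphi$ is in $\mathcal{D}$. Clearly, $S_\epsilon$ is a finite sum for each
$\epsilon > 0$, and thus $F(S_\epsilon) = \epsilon^d \sum_{n \in \mathbb{Z}^d} F(\psi_{n\epsilon}^\sim)\varphi(n\epsilon)$. Hence by the
continuity of $x \mapsto F(\psi_x^\sim)$ a passage to the limit $\epsilon \to 0$ yields (2), proving the
proposition.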

A simple application of the proposition is the observation that every
distribution $F$ in $\mathbb{R}^d$ is the limit of $C^\infty$ functions. We say that a sequence
of distributions $\{F_n\}$ converges to a distribution $F$ in the weak sense
(or in the sense of distributions), if $F_n(\varphi) \to F(\varphi)$, for every $\varphi \in \mathcal{D}$.

Corollary 1.2 Suppose $F$ is a distribution on $\mathbb{R}^d$. Then there exists a
sequence $\{f_n\}$, with $f_n \in C^\infty$, and $f_n \to F$ in the weak sense.


Proof. Let $\{\psi_n\}$ be an approximation to the identity constructed as
follows. Fix a $\psi \in \mathcal{D}$ with $\int_{\mathbb{R}^d} \psi(x)\,dx = 1$ and set $\psi_n(x) = n^d \psi(nx)$.
Form $F_n = F * \psi_n$. Then by the second conclusion of the proposition,
each $F_n$ is a $C^\infty$ function. However by the first conclusion

$$F_n(\varphi) = F(\psi_n^\sim * \varphi) \quad \text{for every } \varphi \in \mathcal{D}.$$

Moreover, as is easily verified, $\psi_n^\sim * \varphi \to \varphi$ in $\mathcal{D}$. Thus $F_n(\varphi) \to F(\varphi)$,
for each $\varphi \in \mathcal{D}$, and the corollary is established.
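
The simplest instance of the corollary is $F = \delta$. The following sketch (an added illustration, using the same $\psi$ and $\psi_n$ as in the proof) shows that the approximating functions are then the $\psi_n$ themselves, and that they converge weakly to $\delta$; the final limit uses the continuity and boundedness of $\varphi$ together with $\int \psi = 1$.

```latex
\begin{align*}
(\delta * \psi_n)(x) &= \delta\big((\psi_n)_x^\sim\big) = (\psi_n)_x^\sim(0) = \psi_n(x), \\
\int_{\mathbb{R}^d} \psi_n(x)\,\varphi(x)\,dx
  &= \int_{\mathbb{R}^d} n^d\,\psi(nx)\,\varphi(x)\,dx
   = \int_{\mathbb{R}^d} \psi(y)\,\varphi(y/n)\,dy
   \;\longrightarrow\; \varphi(0)\int_{\mathbb{R}^d}\psi(y)\,dy = \delta(\varphi)
   \quad (n \to \infty).
\end{align*}
```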
1.3 Supports of distributions

We come next to the notion of the support of a distribution. If $f$ is a
continuous function its support is defined as the closure of the set where
$f(x) \neq 0$. Or put another way, it is the complement of the largest open
set on which $f$ vanishes. For a distribution $F$ we say that $F$ vanishes
in an open set if $F(\varphi) = 0$, for all test functions $\varphi \in \mathcal{D}$ which have their
supports in that open set. Thus we define the support of a distribution
$F$ as the complement of the largest open set on which $F$ vanishes.

This definition is unambiguous because if $F$ vanishes on any collection
of open sets $\{\mathcal{O}_i\}_{i \in I}$, then $F$ vanishes on the union $\mathcal{O} = \bigcup_{i \in I} \mathcal{O}_i$.
Indeed suppose $\varphi$ is a test function supported in the compact set $K \subset \mathcal{O}$.
Since $\mathcal{O}$ covers the compact set $K$, we may select a sub-cover which (after
possibly relabeling the sets $\mathcal{O}_i$) we can write as $K \subset \bigcup_{k=1}^N \mathcal{O}_k$. A
regularization applied to the partition of unity obtained in Section 7 in
Chapter 1 yields smooth functions $\eta_k$ for $1 \le k \le N$ so that $0 \le \eta_k \le 1$,
$\mathrm{supp}(\eta_k) \subset \mathcal{O}_k$, and $\sum_{k=1}^N \eta_k(x) = 1$ whenever $x \in K$. Then
$F(\varphi) = F\big(\sum_{k=1}^N \varphi\eta_k\big) = \sum_{k=1}^N F(\varphi\eta_k) = 0$, since $F$ vanishes on
each $\mathcal{O}_k$. Thus $F$ vanishes on $\mathcal{O}$ as claimed.⁴

Note the following simple facts about the supports of distributions.
The supports of $\partial_x^\alpha F$ and $\psi \cdot F$ (with $\psi \in C^\infty$) are contained in the
support of $F$. The support of the Dirac delta function (as well as its
derivatives) is the origin. Finally, $F(\varphi) = 0$ whenever the supports of $F$
and $\varphi$ are disjoint.

We observe next the additivity of the supports under convolution. 

Proposition 1.3 Suppose $F$ is a distribution whose support is $C_1$, and $\psi$
is in $\mathcal{D}$ and has support $C_2$. Then the support of $F * \psi$ is contained in
$C_1 + C_2$.

Indeed for each $x$ for which $F(\psi_x^\sim) \neq 0$, we must have that the support
of $F$ intersects the support of $\psi_x^\sim$. Since the support of $\psi_x^\sim$ is the set
$x - C_2$ this means that the sets $C_1$ and $x - C_2$ have a point, say $y$, in common.
Because $x = y + (x - y)$, while $y \in C_1$ and $x - y \in C_2$ (since $y \in x - C_2$)
we have that $x \in C_1 + C_2$, and thus our assertion is established. Note
that the set $C_1 + C_2$ is closed because $C_1$ is closed and $C_2$ is compact.

We can now extend the definition of convolution to a pair of distributions
if one of them has compact support. Indeed, if $F$ and $F_1$ are given
distributions with $F_1$ having compact support, then we define $F * F_1$ as
the distribution $(F * F_1)(\varphi) = F(F_1^\sim * \varphi)$, where $F_1^\sim$ is the reflected
distribution given by $F_1^\sim(\varphi) = F_1(\varphi^\sim)$. This extends the definition given
above when $F_1 = \psi \in \mathcal{D}$. Notice that if $C$ is the support of $F_1$, then
$-C$ is the support of $F_1^\sim$. Therefore by the previous proposition $F_1^\sim * \varphi$
has compact support and is $C^\infty$, hence it belongs to $\mathcal{D}$. The fact that
the mapping $\varphi \mapsto (F * F_1)(\varphi)$ has the required continuity in $\mathcal{D}$ is then
straightforward and is left to the reader to verify.

4 One must take care that this notion of support does not coincide with the “support”
defined in Chapter 2 of Book III for an integrable function, when such function is
considered as a distribution. A further clarification is in Exercise 5.

Other properties of convolutions that are direct consequences of the 
above reasoning are as follows: 

• If $F_1$ and $F_2$ have compact support, then $F_1 * F_2 = F_2 * F_1$. (For
this reason we shall sometimes also write $F_1 * F$ for $F * F_1$, when
only $F_1$ has compact support.)

• With $\delta$ the Dirac delta function

$$F * \delta = \delta * F = F.$$

• If $F_1$ has compact support, then for every multi-index $\alpha$

$$\partial_x^\alpha(F * F_1) = (\partial_x^\alpha F) * F_1 = F * (\partial_x^\alpha F_1).$$

• If $F$ and $F_1$ have supports $C$ and $C_1$ respectively, and $C$ is compact,
then the support of $F * F_1$ is contained in $C + C_1$. (This follows
from the previous proposition and the approximation stated
in part (b) of Exercise 4.)

1.4 Tempered distributions 

There are distributions on $\mathbb{R}^d$ that, roughly speaking, are of at most
polynomial growth at infinity. The restricted growth of these distributions
is reflected in the space $\mathcal{S}$ of its test functions. This space $\mathcal{S} = \mathcal{S}(\mathbb{R}^d)$
of test functions (the Schwartz space⁵) consists of indefinitely differentiable
functions on $\mathbb{R}^d$ that are rapidly decreasing at infinity with all
their derivatives. More precisely, we consider the increasing sequence of
norms $\|\cdot\|_N$, with $N$ ranging over the positive integers, defined by⁶

$$\|\varphi\|_N = \sup_{x \in \mathbb{R}^d,\ |\alpha|, |\beta| \le N} |x^\beta (\partial_x^\alpha \varphi)(x)|.$$

We define $\mathcal{S}$ to consist of all smooth functions $\varphi$ such that $\|\varphi\|_N < \infty$ for
every $N$. Moreover, one says that $\varphi_k \to \varphi$ in $\mathcal{S}$, whenever $\|\varphi_k - \varphi\|_N \to 0$,
as $k \to \infty$, for every $N$.

With this in mind we say that $F$ is a tempered distribution if
it is a linear functional on $\mathcal{S}$ which is continuous in the sense that
$F(\varphi_k) \to F(\varphi)$ whenever $\varphi_k \to \varphi$ in $\mathcal{S}$. We shall write $\mathcal{S}^*$ for the vector
space of tempered distributions. Since the test space $\mathcal{D} = \mathcal{D}(\mathbb{R}^d)$ is
contained in $\mathcal{S}$, and convergence in $\mathcal{D}$ implies convergence in $\mathcal{S}$, we see
that any tempered distribution is automatically a distribution on $\mathbb{R}^d$ in
the previous sense. However the converse is not true. (See Exercise 9.)
It is worthwhile to note that $\mathcal{D}$ is dense in $\mathcal{S}$ in that for every function
$\varphi \in \mathcal{S}$, there exists a sequence of functions $\varphi_k \in \mathcal{D}$ such that $\varphi_k \to \varphi$ in
$\mathcal{S}$ as $k \to \infty$. (See Exercise 10.)

5 The space $\mathcal{S}$ occurred already in Chapters 5 and 6 of Book I.

6 We shall use the notation $\|\cdot\|_N$ throughout this chapter. This is not to be confused
with the $L^p$ norm, $\|\cdot\|_{L^p}$.

It is also useful to observe that any tempered distribution is already
controlled by finitely many of the norms $\|\cdot\|_N$.

Proposition 1.4 Suppose $F$ is a tempered distribution. Then there is a
positive integer $N$ and a constant $c > 0$, so that

$$|F(\varphi)| \le c\|\varphi\|_N, \quad \text{for all } \varphi \in \mathcal{S}.$$

Proof. Assume otherwise. Then the conclusion fails and for each
positive integer $n$ there is a $\psi_n \in \mathcal{S}$ with $\|\psi_n\|_n = 1$, while $|F(\psi_n)| \ge n$.
Take $\varphi_n = \psi_n / n^{1/2}$. Then $\|\varphi_n\|_N \le \|\varphi_n\|_n$ as soon as $n \ge N$, and thus
$\|\varphi_n\|_N \le n^{-1/2} \to 0$ as $n \to \infty$, while $|F(\varphi_n)| \ge n^{1/2} \to \infty$,
contradicting the continuity of $F$.
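
For concreteness, here are three cases in which the $N$ and $c$ of the proposition can be written down by hand (a sketch added for illustration; the particular choices of $N$ and $c$ are not claimed to be optimal). Each bound uses only the definition of $\|\cdot\|_N$ with $\beta = 0$.

```latex
\begin{align*}
|\delta(\varphi)| &= |\varphi(0)| \le \|\varphi\|_1
  &&(N = 1,\ c = 1), \\
|(\partial_x^\alpha \delta)(\varphi)| &= |(\partial_x^\alpha\varphi)(0)| \le \|\varphi\|_N
  &&(N = \max(|\alpha|, 1),\ c = 1), \\
\Big|\int_{\mathbb{R}^d} f(x)\varphi(x)\,dx\Big|
  &\le \|f\|_{L^1}\,\sup_x|\varphi(x)| \le \|f\|_{L^1}\,\|\varphi\|_1
  &&(f \in L^1(\mathbb{R}^d),\ N = 1,\ c = \|f\|_{L^1}).
\end{align*}
```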

The following are some simple examples of tempered distributions. 

• A distribution $F$ of compact support is also tempered. This follows
from the fact that if $C$ is the support of $F$, there is an $\eta \in \mathcal{D}$, with
$\eta(x) = 1$ for all $x$ in a neighborhood of $C$, hence $F(\varphi) = F(\eta\varphi)$ if
$\varphi \in \mathcal{D}$. Thus the linear functional $F$ defined on $\mathcal{D}$ has an obvious
extension to $\mathcal{S}$ given by $\varphi \mapsto F(\eta\varphi)$, and this gives the corresponding
distribution.

• Suppose $f$ is locally integrable on $\mathbb{R}^d$ and for some $N \ge 0$,

$$\int_{|x| \le R} |f(x)|\,dx = O(R^N), \quad \text{as } R \to \infty.$$

Then the distribution corresponding to $f$ is tempered. Hence in
particular this holds if $f \in L^p(\mathbb{R}^d)$ for some $p$ with $1 \le p \le \infty$.

• Whenever $F$ is tempered so is $\partial_x^\alpha F$ for all $\alpha$; also $x^\beta F(x)$ is tempered
for all multi-index $\beta \ge 0$.

The last assertion can be generalized as follows: let $\psi$ be any $C^\infty$ function
on $\mathbb{R}^d$ which is slowly increasing: this means that for each $\alpha$, $\partial_x^\alpha \psi(x) =
O(|x|^{N_\alpha})$ as $|x| \to \infty$, for some $N_\alpha \ge 0$. Then $\psi F$ defined by $(\psi F)(\varphi) =
F(\psi\varphi)$ is also a tempered distribution, whenever $F$ is tempered.

The properties of convolutions of distributions discussed in Sections 1.2 
and 1.3 have modifications for tempered distributions. The proofs of the 
assertions below are routine adaptations of previous arguments. 

(a) If $F$ is tempered and $\psi \in \mathcal{S}$, then $F * \psi$ defined as the function
$F(\psi_x^\sim)$ is $C^\infty$ and slowly increasing. Moreover the alternate definition
$(F * \psi)(\varphi) = F(\psi^\sim * \varphi)$, for $\varphi \in \mathcal{S}$, continues to be valid here.
To verify this we need the fact that $\psi^\sim * \varphi \in \mathcal{S}$, whenever $\psi$ and $\varphi$
are in $\mathcal{S}$. (See Exercise 11.)

(b) If $F$ is a tempered distribution and $F_1$ is a distribution of compact
support, then $F * F_1$ is also tempered. Note that $(F * F_1)(\varphi) =
F(F_1^\sim * \varphi)$, and to establish the claim we need the implication
that $F_1^\sim * \varphi \in \mathcal{S}$, if $F_1$ has compact support and $\varphi \in \mathcal{S}$. (See
Exercise 12.)

1.5 Fourier transform 

The main interest of tempered distributions is that this class is mapped 
into itself by the Fourier transform, and this is a reflection of the fact 
that the space S is also closed under the Fourier transform. 

Recall that whenever $\varphi \in \mathcal{S}$, its Fourier transform $\varphi^\wedge$ (also sometimes
denoted by $\hat{\varphi}$) is defined as the convergent integral⁷

$$\varphi^\wedge(\xi) = \int_{\mathbb{R}^d} \varphi(x) e^{-2\pi i x \cdot \xi}\,dx.$$

The mapping $\varphi \mapsto \varphi^\wedge$ is a continuous bijection of $\mathcal{S}$ to $\mathcal{S}$ whose inverse
is given by the mapping $\psi \mapsto \psi^\vee$, where

$$\psi^\vee(x) = \int_{\mathbb{R}^d} \psi(\xi) e^{2\pi i x \cdot \xi}\,d\xi.$$

In this connection it is useful to keep in mind the simple norm estimate

$$\|\hat{\varphi}\|_N \le C_N \|\varphi\|_{N+d+1},$$

which holds for every $\varphi \in \mathcal{S}$ and every $N \ge 0$. (This estimate is itself
immediate from the observation $\sup_\xi |\hat{\varphi}(\xi)| \le \int_{\mathbb{R}^d} |\varphi(x)|\,dx \le A\|\varphi\|_{d+1}$.)
The multiplication identity

$$\int_{\mathbb{R}^d} \hat{\psi}(x)\varphi(x)\,dx = \int_{\mathbb{R}^d} \psi(x)\hat{\varphi}(x)\,dx$$

(which holds for all $\varphi, \psi \in \mathcal{S}$) suggests the definition of the Fourier
transform $F^\wedge$ (sometimes denoted by $\hat{F}$) for a tempered distribution $F$.
It is

$$F^\wedge(\varphi) = F(\varphi^\wedge), \quad \text{for all } \varphi \in \mathcal{S}.$$

7 For the elementary facts about the Fourier transform on $\mathcal{S}$ that are used here, see for
example Chapters 5 and 6 of Book I.

From this it follows that the mapping $F \mapsto F^\wedge$ is a bijection of the space
of tempered distributions, with inverse the mapping $F \mapsto F^\vee$, where $F^\vee$
is defined by $F^\vee(\varphi) = F(\varphi^\vee)$. Indeed

$$(F^\wedge)^\vee(\varphi) = F^\wedge(\varphi^\vee) = F((\varphi^\vee)^\wedge) = F(\varphi).$$

Moreover the mappings $F \mapsto F^\wedge$ and $F \mapsto F^\vee$ are continuous with
convergence of distributions taken in the weak sense, that is, $F_n \to F$ if
$F_n(\varphi) \to F(\varphi)$, as $n \to \infty$ for all $\varphi \in \mathcal{S}$. (This convergence is also said
to be in the sense of tempered distributions.)

Next it is worthwhile to point out that the definition of the Fourier
transform in the general context of tempered distributions is consistent
with (and generalizes) previous definitions given in various particular
settings. Let us take for example the $L^2$ definition via Plancherel’s
theorem.⁸ Starting with an $f \in L^2(\mathbb{R}^d)$, we write $F = F_f$ for the corresponding
tempered distribution. Now $f$ can be approximated (in the $L^2$ norm)
by a sequence $\{f_n\}$, with $f_n \in \mathcal{S}$. Thus taken as distributions, $f_n \to F$
in the weak sense above. Hence $\hat{f}_n \to \hat{F}$ also in the weak sense, but
since $\hat{f}_n$ converges in the $L^2$ norm to $\hat{f}$, we see that $\hat{F}$ is the function $\hat{f}$.
Similar arguments hold for $f \in L^p(\mathbb{R}^d)$, with $1 \le p \le 2$, and $\hat{f}$ defined in
$L^q(\mathbb{R}^d)$, $1/p + 1/q = 1$, in accordance with the Hausdorff-Young theorem
in Section 2 of the previous chapter.

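8 See Section 1, Chapter 5 in Book III.

Let us next remark that the usual formal rules involving differentiation
and multiplication by monomials apply to the Fourier transform in this
general context. Thus, if $F \in \mathcal{S}^*$, we have

$$(\partial_x^\alpha F)^\wedge = (2\pi i x)^\alpha F^\wedge,$$

because

$$\begin{aligned}
(\partial_x^\alpha F)^\wedge(\varphi) &= \partial_x^\alpha F(\varphi^\wedge) \\
&= (-1)^{|\alpha|} F(\partial_x^\alpha(\varphi^\wedge)) \\
&= F(((2\pi i x)^\alpha \varphi)^\wedge) \\
&= (2\pi i x)^\alpha F^\wedge(\varphi).
\end{aligned}$$

Similarly $((-2\pi i x)^\alpha F)^\wedge = \partial_x^\alpha(F^\wedge)$. One should also observe that if 1 is
the function that is identically equal to 1, then as tempered distributions

$$\hat{1} = \delta \quad \text{and} \quad \hat{\delta} = 1,$$

and by the above

$$((-2\pi i x)^\alpha)^\wedge = \partial_x^\alpha \delta \quad \text{while} \quad (\partial_x^\alpha \delta)^\wedge = (2\pi i x)^\alpha.$$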
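
The two basic identities for the constant function 1 and the Dirac delta can be checked directly from the definition $F^\wedge(\varphi) = F(\varphi^\wedge)$. The short derivation below is an added sketch; it uses only the integral formula for $\hat{\varphi}$ at $\xi = 0$ and Fourier inversion on $\mathcal{S}$ evaluated at $x = 0$.

```latex
\begin{align*}
\hat{\delta}(\varphi) &= \delta(\hat{\varphi}) = \hat{\varphi}(0)
   = \int_{\mathbb{R}^d} \varphi(x)\,dx = 1(\varphi), \\
\hat{1}(\varphi) &= 1(\hat{\varphi}) = \int_{\mathbb{R}^d} \hat{\varphi}(\xi)\,d\xi
   = (\hat{\varphi})^\vee(0) = \varphi(0) = \delta(\varphi),
   \qquad \varphi \in \mathcal{S}.
\end{align*}
```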

The following additional properties elucidate the nature of the Fourier 
transform in the context of tempered distributions. 

Proposition 1.5 Suppose $F$ is a tempered distribution and $\psi \in \mathcal{S}$. Then
$F * \psi$ is a slowly increasing $C^\infty$ function, which when considered as a
tempered distribution satisfies $(F * \psi)^\wedge = \psi^\wedge F^\wedge$.

Proof. The fact that $F(\psi_x^\sim)$ is slowly increasing follows from the
proposition in Section 1.4 together with the observation that for any
function $\psi \in \mathcal{S}$ and $N$, $\|\psi_x^\sim\|_N \le c(1 + |x|)^N \|\psi\|_N$, and more generally,

$$\|\partial_x^\alpha \psi_x^\sim\|_N \le c(1 + |x|)^N \|\psi\|_{N + |\alpha|}.$$

Since $(F * \psi)(\varphi) = F(\psi^\sim * \varphi)$, it follows that $(F * \psi)^\wedge(\varphi) = F(\psi^\sim * \varphi^\wedge)$.
On the other hand, $\psi^\wedge F^\wedge(\varphi) = F^\wedge(\psi^\wedge \varphi) = F((\psi^\wedge \varphi)^\wedge)$. Thus the
desired identity, $(F * \psi)^\wedge(\varphi) = (\psi^\wedge F^\wedge)(\varphi)$, is proved because, as is easily
verified, $(\psi^\wedge \varphi)^\wedge = \psi^\sim * \varphi^\wedge$.


Proposition 1.6 If $F$ is a distribution of compact support then its Fourier
transform $F^\wedge$ is a slowly increasing $C^\infty$ function. In fact, as a function
of $\xi$ one has $F^\wedge(\xi) = F(e_\xi)$ where $e_\xi$ is the element of $\mathcal{D}$ given by
$e_\xi(x) = \eta(x) e^{-2\pi i x \cdot \xi}$, with $\eta$ a function in $\mathcal{D}$ that equals 1 in a
neighborhood of the support of $F$.

Proof. If we invoke Proposition 1.4, we see immediately that $|F(e_\xi)| \le
c\|e_\xi\|_N \le c'(1 + |\xi|)^N$. By the same estimate, every difference quotient
of $F(e_\xi)$ converges and $|\partial_\xi^\alpha F(e_\xi)| \le c_\alpha (1 + |\xi|)^N$. Therefore $F(e_\xi)$
is $C^\infty$ and slowly increasing. To prove that the function $F(e_\xi)$ is the
Fourier transform of $F$ it suffices to see that

$$\int_{\mathbb{R}^d} F(e_\xi)\varphi(\xi)\,d\xi = F(\hat{\varphi}) \quad \text{for every } \varphi \in \mathcal{S}. \tag{3}$$

We prove this first when $\varphi \in \mathcal{D}$.

Now by what we have already seen, the function $g(\xi) = F(e_\xi)\varphi(\xi)$ is
continuous and certainly has compact support. Thus

$$\int_{\mathbb{R}^d} g(\xi)\,d\xi = \lim_{\epsilon \to 0} S_\epsilon,$$

where for each $\epsilon > 0$, $S_\epsilon$ is the (finite) sum $\epsilon^d \sum_{n \in \mathbb{Z}^d} g(n\epsilon)$. However
$S_\epsilon = F(s_\epsilon)$, with $s_\epsilon = \epsilon^d \sum_{n \in \mathbb{Z}^d} e_{n\epsilon}(x)\varphi(n\epsilon)$. Clearly as $\epsilon \to 0$, we have

$$s_\epsilon(x) \to \eta(x) \int_{\mathbb{R}^d} e^{-2\pi i x \cdot \xi}\varphi(\xi)\,d\xi = \eta(x)\hat{\varphi}(x)$$

in the $\|\cdot\|_N$ norm. Thus, using Proposition 1.4 again, we get that $S_\epsilon \to
F(\eta\hat{\varphi})$. Now since $\eta = 1$ in a neighborhood of the support of $F$, then
$F(\eta\hat{\varphi}) = F(\hat{\varphi})$. Altogether we have (3) when $\varphi \in \mathcal{D}$, and to extend this
result to $\varphi \in \mathcal{S}$ it suffices to recall that $\mathcal{D}$ is dense in $\mathcal{S}$.


1.6 Distributions with point supports 

Unlike continuous functions, distributions can have isolated points as 
their support. This is the case of the Dirac delta function and each of its 
derivatives. That these examples represent essentially the general case 
of this phenomenon, is contained in the following theorem. 

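Theorem 1.7 Suppose $F$ is a distribution supported at the origin. Then
$F$ is a finite sum

$$F = \sum_{|\alpha| \le N} a_\alpha \partial_x^\alpha \delta.$$

That is,

$$F(\varphi) = \sum_{|\alpha| \le N} (-1)^{|\alpha|} a_\alpha (\partial_x^\alpha \varphi)(0), \quad \text{for } \varphi \in \mathcal{D}.$$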

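The argument is based on the following.

Lemma 1.8 Suppose $F_1$ is a distribution supported at the origin that
satisfies for some $N$ the following two conditions:

(a) $|F_1(\varphi)| \le c\|\varphi\|_N$, for all $\varphi \in \mathcal{D}$.

(b) $F_1(x^\alpha) = 0$, for all $|\alpha| \le N$.

Then $F_1 = 0$.

In fact, let $\eta \in \mathcal{D}$, with $\eta(x) = 0$ for $|x| \ge 1$, and $\eta(x) = 1$ when $|x| \le 1/2$,
and write $\eta_\epsilon(x) = \eta(x/\epsilon)$. Then since $F_1$ is supported at the origin,
$F_1(\eta_\epsilon \varphi) = F_1(\varphi)$. Moreover, by the same token $F_1(\eta_\epsilon x^\alpha) = F_1(x^\alpha) = 0$
for all $|\alpha| \le N$, and hence

$$F_1(\varphi) = F_1\Big(\eta_\epsilon(x)\Big(\varphi(x) - \sum_{|\alpha| \le N} \frac{\varphi^{(\alpha)}(0)}{\alpha!} x^\alpha\Big)\Big),$$

with $\varphi^{(\alpha)} = \partial_x^\alpha \varphi$. If $R(x) = \varphi(x) - \sum_{|\alpha| \le N} \frac{\varphi^{(\alpha)}(0)}{\alpha!} x^\alpha$ is the
remainder, then $|R(x)| \le c|x|^{N+1}$ and $|\partial_x^\beta R(x)| \le c_\beta |x|^{N+1-|\beta|}$, when $|\beta| \le N$.
However $|\partial_x^\beta \eta_\epsilon(x)| \le c_\beta \epsilon^{-|\beta|}$ and $\partial_x^\beta \eta_\epsilon(x) = 0$ if $|x| \ge \epsilon$. Thus by
Leibniz’s rule, $\|\eta_\epsilon R\|_N \le c\epsilon$, and our assumption (a) gives $|F_1(\varphi)| \le c'\epsilon$,
which yields the desired conclusion upon letting $\epsilon \to 0$.

Proceeding with the proof of the theorem, we now apply the above
lemma to $F_1 = F - \sum_{|\alpha| \le N} a_\alpha \partial_x^\alpha \delta$, where $N$ is the index that
guarantees the conclusion of Proposition 1.4, while the $a_\alpha$ are chosen so that
$a_\alpha = \frac{(-1)^{|\alpha|}}{\alpha!} F(x^\alpha)$. Then since $\partial_x^\alpha(\delta)(x^\beta) = (-1)^{|\alpha|} \alpha!$ if $\alpha = \beta$, and zero
otherwise, we see that $F_1(x^\alpha) = 0$ for all $|\alpha| \le N$, so the lemma gives
$F_1 = 0$, which proves the theorem.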



2 Important examples of distributions 

Having described the elementary properties of distributions, we now intend
to illustrate their occurrence in several areas of analysis.

2.1 The Hilbert transform and $\mathrm{pv}(\frac{1}{x})$

We consider the function $1/x$, defined for real $x$ with $x \neq 0$. As it stands,
this function is not a distribution on $\mathbb{R}$ because it is not integrable near
the origin. However, there is a distribution that can be naturally associated
to the function $1/x$. It is defined as the principal value

$$\varphi \mapsto \lim_{\epsilon \to 0} \int_{|x| \ge \epsilon} \varphi(x)\,\frac{dx}{x}.$$

We observe first that the limit exists for every function $\varphi \in \mathcal{S}$. Assuming
$\epsilon \le 1$, we write

$$\int_{|x| \ge \epsilon} \varphi(x)\,\frac{dx}{x} = \int_{1 \ge |x| \ge \epsilon} \varphi(x)\,\frac{dx}{x} + \int_{|x| > 1} \varphi(x)\,\frac{dx}{x}. \tag{4}$$

The right most integral clearly converges because of the (rapid) decay
of $\varphi$ at infinity. As to the other integral on the right-hand side, we can
write it as

$$\int_{1 \ge |x| \ge \epsilon} \frac{\varphi(x) - \varphi(0)}{x}\,dx$$

because $\int_{1 \ge |x| \ge \epsilon} \frac{dx}{x} = 0$ due to the fact that $1/x$ is an odd function.
However $|\varphi(x) - \varphi(0)| \le c|x|$ (with $c = \sup |\varphi'(x)|$), thus the limit as $\epsilon \to 0$
of the left-hand side of (4) clearly exists. We denote this limit as

$$\mathrm{pv} \int_{\mathbb{R}} \varphi(x)\,\frac{dx}{x}.$$

It is also evident from the above that

$$\Big|\mathrm{pv} \int_{\mathbb{R}} \varphi(x)\,\frac{dx}{x}\Big| \le c'\|\varphi\|_1$$

(where the norm $\|\cdot\|_1$ is defined in Section 1.4), and thus

$$\varphi \mapsto \mathrm{pv} \int_{\mathbb{R}} \varphi(x)\,\frac{dx}{x}$$

is a tempered distribution. We denote this distribution by $\mathrm{pv}(\frac{1}{x})$.
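
Two quick evaluations may help fix the definition (an added sketch; the test functions below are chosen only for illustration). For an odd test function the singularity cancels outright, while for an even one every truncated integral already vanishes by symmetry.

```latex
\begin{align*}
\varphi(x) = x\,e^{-\pi x^2}: \qquad
  \mathrm{pv}\!\int_{\mathbb{R}} \varphi(x)\,\frac{dx}{x}
  &= \lim_{\epsilon\to 0}\int_{|x|\ge\epsilon} e^{-\pi x^2}\,dx
   = \int_{\mathbb{R}} e^{-\pi x^2}\,dx = 1, \\
\varphi(x) = e^{-\pi x^2}: \qquad
  \mathrm{pv}\!\int_{\mathbb{R}} \varphi(x)\,\frac{dx}{x}
  &= \lim_{\epsilon\to 0}\int_{|x|\ge\epsilon} \frac{e^{-\pi x^2}}{x}\,dx = 0,
\end{align*}
```

the second because the integrand $e^{-\pi x^2}/x$ is odd and the region $|x| \ge \epsilon$ is symmetric about the origin.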

As the reader may have guessed, the distribution $\mathrm{pv}(\frac{1}{x})$ is intimately
connected with the Hilbert transform $H$ studied in the previous chapter.
We observe first that

$$H(f) = \frac{1}{\pi}\,\mathrm{pv}\Big(\frac{1}{x}\Big) * f, \quad \text{for } f \in \mathcal{S}. \tag{5}$$

Indeed, according to the definition of $\mathrm{pv}(\frac{1}{x})$ and the definition of the
convolution, we have

$$\frac{1}{\pi}\Big(\mathrm{pv}\Big(\frac{1}{x}\Big) * f\Big)(x) = \lim_{\epsilon \to 0} \frac{1}{\pi} \int_{|y| \ge \epsilon} f(x - y)\,\frac{dy}{y},$$

and this limit exists for every $x$. However Proposition 3.1 of the previous
chapter asserts that the right-hand side also converges in the $L^2(\mathbb{R})$ norm
to $H(f)$ as $\epsilon \to 0$, whenever $f \in L^2(\mathbb{R})$. Thus the convolution $\frac{1}{\pi}\mathrm{pv}(\frac{1}{x}) * f$
equals the $L^2$ function $H(f)$.

We now give several alternate formulations of $\mathrm{pv}(\frac{1}{x})$. The meanings
of the abbreviations used will be explained in the proof of the theorem
below.

Theorem 2.1 The distribution $\mathrm{pv}(\frac{1}{x})$ equals:

(a) $\frac{d}{dx}(\log|x|)$.

(b) $\frac{1}{2}\Big(\frac{1}{x - i0} + \frac{1}{x + i0}\Big)$.

Also, its Fourier transform equals $\frac{\pi}{i}\,\mathrm{sign}(\xi)$.

Regarding (a), note that $\log|x|$ is a locally integrable function. Here
$\frac{d}{dx}(\log|x|)$ is its derivative taken as a distribution. Now in that sense

$$\Big(\frac{d}{dx}\log|x|\Big)(\varphi) = -\int_{-\infty}^{\infty} (\log|x|)\,\frac{d\varphi}{dx}\,dx, \quad \text{for every } \varphi \in \mathcal{S}.$$

However the integral is the limit as $\epsilon \to 0$ of $-\int_{|x| \ge \epsilon} (\log|x|)\frac{d\varphi}{dx}\,dx$, and
an integration by parts shows that this equals

$$\int_{|x| \ge \epsilon} \frac{\varphi(x)}{x}\,dx + \log(\epsilon)[\varphi(\epsilon) - \varphi(-\epsilon)].$$

Moreover, $\varphi(\epsilon) - \varphi(-\epsilon) = O(\epsilon)$ since in particular $\varphi$ is of class $C^1$.
Therefore $\log(\epsilon)[\varphi(\epsilon) - \varphi(-\epsilon)] \to 0$ as $\epsilon \to 0$, and we have established (a).

We turn to conclusion (b) and consider for $\epsilon > 0$ the bounded function
$1/(x - i\epsilon)$. We will see that as $\epsilon \to 0$, the function $1/(x - i\epsilon)$ converges
to a limit in the sense of distributions, which we denote by $1/(x - i0)$.
We will also see that $1/(x - i0) = \mathrm{pv}(\frac{1}{x}) + i\pi\delta$. Similarly,
$\lim_{\epsilon \to 0} 1/(x + i\epsilon) = 1/(x + i0)$ will exist and equals $\mathrm{pv}(\frac{1}{x}) - i\pi\delta$. To prove
this, we are thus led to the function

$$\frac{1}{2}\Big(\frac{1}{x - i\epsilon} + \frac{1}{x + i\epsilon}\Big) = \frac{x}{x^2 + \epsilon^2}.$$

We claim first that

$$\frac{x}{x^2 + \epsilon^2} \to \mathrm{pv}\Big(\frac{1}{x}\Big) \quad \text{as } \epsilon \to 0 \tag{6}$$

in the sense of distributions.

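We are dealing in effect with the conjugate Poisson kernel $Q_\epsilon(x) =
\frac{1}{\pi}\frac{x}{x^2 + \epsilon^2}$, defined in Section 3.1 of the previous chapter. The argument
there, after the identities (18), shows that

$$\begin{aligned}
\frac{1}{\pi}\int_{|x| \ge \epsilon} \varphi(x)\,\frac{dx}{x} - \int_{\mathbb{R}} \varphi(x) Q_\epsilon(x)\,dx &= \int_{\mathbb{R}} \varphi(x)\Delta_\epsilon(x)\,dx \\
&= \int_{|x| \le 1} [\varphi(x) - \varphi(0)]\Delta_\epsilon(x)\,dx + \int_{|x| > 1} \varphi(x)\Delta_\epsilon(x)\,dx,
\end{aligned}$$

since $\Delta_\epsilon(x)$ is an odd function of $x$. This function satisfies the estimates
$|\Delta_\epsilon(x)| \le A/\epsilon$ and $|\Delta_\epsilon(x)| \le A\epsilon/x^2$. Moreover if $\varphi \in \mathcal{D}$, then
$|\varphi(x) - \varphi(0)| \le c|x|$ and $\varphi$ is bounded on $\mathbb{R}$. Therefore

$$\Big|\int_{\mathbb{R}} \varphi(x)\Delta_\epsilon(x)\,dx\Big| \le O\Big\{\epsilon^{-1}\int_{|x| \le \epsilon} |x|\,dx + \epsilon\int_{\epsilon < |x| \le 1} \frac{dx}{|x|} + \epsilon\int_{|x| > 1} \frac{dx}{x^2}\Big\}.$$

The expression on the right is clearly $O(\epsilon|\log \epsilon|)$ as $\epsilon \to 0$, and hence
tends to zero. Therefore we have established (6). Next, recall the
identity (15) in the previous chapter

$$-\frac{1}{i\pi z} = \mathcal{P}_y(x) + iQ_y(x), \quad z = x + iy,$$

where $\mathcal{P}_y(x)$ is the Poisson kernel $\frac{1}{\pi}\frac{y}{x^2 + y^2}$. By letting $y = \epsilon > 0$, and
taking complex conjugates we see

$$\frac{1}{x - i\epsilon} = \pi Q_\epsilon(x) + i\pi\mathcal{P}_\epsilon(x).$$

Since the $\mathcal{P}_y$ form an approximation to the identity (see Chapter 3 in
Book III) or by an argument very similar to the one just given for $Q_y$,
we have that $\mathcal{P}_\epsilon \to \delta$ as $\epsilon \to 0$. Thus

$$\frac{1}{x - i0} = \mathrm{pv}\Big(\frac{1}{x}\Big) + i\pi\delta.$$

We may take complex conjugates of the above identity and also obtain,
as a limit in the sense of distributions,

$$\frac{1}{x + i0} = \mathrm{pv}\Big(\frac{1}{x}\Big) - i\pi\delta.$$

Adding these two gives conclusion (b). Notice that incidentally, we have
obtained the identity

$$\frac{1}{x - i0} - \frac{1}{x + i0} = 2\pi i\delta.$$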







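To prove the last statement of the theorem we consider the Fourier
transform of $x/(x^2 + \epsilon^2)$ taken in the sense of distributions. By (17) in
Section 3.1 of the previous chapter we have that

$$\int_{\mathbb{R}} f(-x)\,\frac{x\,dx}{x^2 + \epsilon^2} = \frac{\pi}{i}\int_{\mathbb{R}} \hat{f}(\xi) e^{-2\pi\epsilon|\xi|}\,\mathrm{sign}(\xi)\,d\xi$$

for all $f \in L^2(\mathbb{R})$, and this holds in particular for $f \in \mathcal{S}$. Substituting
for $f$, $f = \hat{\varphi}$ (and noting that $(\varphi^\wedge)^\wedge = \varphi(-x)$) we get

$$\int_{\mathbb{R}} \hat{\varphi}(x)\,\frac{x\,dx}{x^2 + \epsilon^2} = \frac{\pi}{i}\int_{\mathbb{R}} \varphi(\xi) e^{-2\pi\epsilon|\xi|}\,\mathrm{sign}(\xi)\,d\xi.$$

Letting $\epsilon \to 0$, this yields

$$\mathrm{pv}\Big(\frac{1}{x}\Big)(\hat{\varphi}) = \frac{\pi}{i}\int_{\mathbb{R}} \varphi(\xi)\,\mathrm{sign}(\xi)\,d\xi,$$

which shows that $(\mathrm{pv}\,\frac{1}{x})^\wedge$ is the function $\frac{\pi}{i}\,\mathrm{sign}(\xi)$, and the proof of the
theorem is concluded.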

Let us remark that we have seen from the above that the distributions
$1/(x - i0)$, $1/(x + i0)$, and $\mathrm{pv}(\frac{1}{x})$, while different, all agree with
the function $1/x$ away from the origin.


2.2 Homogeneous distributions 

We pass to the next topic by observing that $\mathrm{pv}(\frac{1}{x})$ is a homogeneous
distribution. To define this notion, recall that a function $f$ defined on
$\mathbb{R}^d - \{0\}$ is said to be homogeneous of degree $\lambda$, if $f_a = a^\lambda f$, for every
$a > 0$, where $f_a(x) = f(ax)$. Now the dilation $F_a$ of a distribution $F$ has
been defined by duality:

$$F_a(\varphi) = F(\varphi^a),$$

where $\varphi^a$ is the dual dilation of $\varphi$, that is, $\varphi^a = a^{-d}\varphi_{a^{-1}}$. We can
incidentally define the dual dilation $F^a$ by $F^a(\varphi) = F(\varphi_a)$, and note that
$F^a = a^{-d}F_{a^{-1}}$.

In view of the above, a distribution $F$ is said to be homogeneous of
degree $\lambda$, if $F_a = a^\lambda F$ for all $a > 0$.

Now the function $1/x$ is clearly homogeneous of degree $-1$, but what
is significant for us is that the distribution $\mathrm{pv}(\frac{1}{x})$ is homogeneous of
degree $-1$. In fact

$$\mathrm{pv}\Big(\frac{1}{x}\Big)_a(\varphi) = \mathrm{pv}\Big(\frac{1}{x}\Big)(\varphi^a) = a^{-1}\lim_{\epsilon \to 0}\int_{|x| \ge \epsilon} \varphi(x/a)\,\frac{dx}{x} = a^{-1}\lim_{\epsilon \to 0}\int_{|x| \ge \epsilon/a} \varphi(x)\,\frac{dx}{x} = a^{-1}\,\mathrm{pv}\Big(\frac{1}{x}\Big)(\varphi).$$

The next to the last identity follows from making the change of
variables $x \to ax$ and noting that $dx/x$ remains unchanged. The reader may
also verify that the distributions $1/(x - i0)$, $1/(x + i0)$, and $\delta$ are also
homogeneous of degree $-1$.

There is an important interplay between homogeneous distributions
and the Fourier transform. A hint that this may be so is the elementary
identity $(\varphi_a)^\wedge = (\varphi^\wedge)^a$, that holds for all $\varphi \in \mathcal{S}$, where $\varphi_a$ and $\varphi^a$ are
the dilations of $\varphi$ defined earlier. The simplest proposition containing
this idea is the following.

Proposition 2.2 Suppose $F$ is a tempered distribution on $\mathbb{R}^d$ that is
homogeneous of degree $\lambda$. Then its Fourier transform $F^\wedge$ is homogeneous
of degree $-d - \lambda$.


Remark. The restriction that F be tempered is unnecessary. It can 
be shown that any homogeneous distribution is automatically tempered. 
See Exercise 8 for this result. 

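To deal with $(F^\wedge)_a$ we write successively,

$$\begin{aligned}
(F^\wedge)_a(\varphi) &= F^\wedge(\varphi^a) = F((\varphi^a)^\wedge) = F((\varphi^\wedge)_a) \\
&= F^a(\varphi^\wedge) = a^{-d}F_{a^{-1}}(\varphi^\wedge) = a^{-d-\lambda}F(\varphi^\wedge) = a^{-d-\lambda}F^\wedge(\varphi).
\end{aligned}$$

Thus $(F^\wedge)_a = a^{-d-\lambda}F^\wedge$, as was to be proved.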
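
As a consistency check on the degree count $\lambda \mapsto -d - \lambda$, here is a small table of examples (an added sketch). The degree of $\delta$ follows from $\delta_a(\varphi) = a^{-d}\delta(\varphi_{a^{-1}}) = a^{-d}\varphi(0)$, and the transforms in the second column are the standard ones for the Dirac delta, the constant 1, and the principal value of $1/x$.

```latex
\begin{array}{lll}
F \ (\text{degree } \lambda) & F^\wedge & \text{degree of } F^\wedge = -d-\lambda \\ \hline
\delta \ \text{on } \mathbb{R}^d \quad (\lambda = -d) & 1 & 0 = -d-(-d) \\
1 \ \text{on } \mathbb{R}^d \quad (\lambda = 0) & \delta & -d = -d-0 \\
\mathrm{pv}\big(\tfrac{1}{x}\big) \ \text{on } \mathbb{R} \quad (\lambda = -1,\ d = 1)
  & \tfrac{\pi}{i}\operatorname{sign}(\xi) & 0 = -1-(-1)
\end{array}
```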


A particularly interesting example arises if we consider the function
$|x|^\lambda$ which is homogeneous of degree $\lambda$ and locally integrable if $\lambda > -d$.
Let $H_\lambda$ denote the corresponding distribution (for $\lambda > -d$); this is clearly
tempered.

The following identity holds. 

Theorem 2.3 If $-d < \lambda < 0$, then

$$(H_\lambda)^\wedge = c_\lambda H_{-d-\lambda}, \quad \text{with } c_\lambda = \frac{\Gamma\left(\frac{d+\lambda}{2}\right)}{\Gamma\left(-\frac{\lambda}{2}\right)}\,\pi^{-d/2-\lambda}.$$

Note that the assumption $\lambda < 0$ guarantees that $-d - \lambda > -d$ so that
$|x|^{-d-\lambda}$, which defines $H_{-d-\lambda}$, is again locally integrable.

To prove the theorem we start with the fact that $\psi(x) = e^{-\pi|x|^2}$ is its
own Fourier transform. Then since $(\psi_a)^\wedge = (\hat{\psi})^a$ we get (with $a = t^{1/2}$)

$$\int_{\mathbb{R}^d} e^{-\pi t|x|^2}\hat{\varphi}(x)\,dx = t^{-d/2}\int_{\mathbb{R}^d} e^{-\pi|x|^2/t}\varphi(x)\,dx.$$

We now multiply both sides by $t^{-\lambda/2-1}$ and integrate over $(0, \infty)$, and
then interchange the order of integration. We note that

$$\int_0^\infty e^{-tA}t^{-\lambda/2-1}\,dt = A^{\lambda/2}\,\Gamma(-\lambda/2),$$

if $A > 0$ and $\lambda < 0$, by making the change of variables $t \to t/A$ that
reduces the identity to the case $A = 1$. Thus using the above identity
with $A = \pi|x|^2$, we get

$$\int_{\mathbb{R}^d}\int_0^\infty e^{-\pi t|x|^2}\hat{\varphi}(x)\,t^{-\lambda/2-1}\,dt\,dx = \pi^{\lambda/2}\,\Gamma(-\lambda/2)\int_{\mathbb{R}^d}|x|^\lambda\hat{\varphi}(x)\,dx.$$

Similarly, we deal with $\int_0^\infty t^{-d/2}t^{-\lambda/2-1}e^{-A/t}\,dt$ by making the change
of variables $t \to 1/t$, which shows that this integral equals

$$\int_0^\infty t^{d/2+\lambda/2-1}e^{-At}\,dt = A^{-d/2-\lambda/2}\,\Gamma(d/2+\lambda/2).$$

Inserting this in $\int_{\mathbb{R}^d}\int_0^\infty t^{-d/2}e^{-\pi|x|^2/t}\,t^{-\lambda/2-1}\varphi(x)\,dt\,dx$ yields

$$\pi^{\lambda/2}\,\Gamma(-\lambda/2)\int_{\mathbb{R}^d}|x|^\lambda\hat{\varphi}(x)\,dx = \pi^{-d/2-\lambda/2}\,\Gamma(d/2+\lambda/2)\int_{\mathbb{R}^d}|x|^{-d-\lambda}\varphi(x)\,dx,$$

and this is our theorem.

The principal value distribution $\mathrm{pv}(\frac{1}{x})$ and the $H_\lambda$ just considered
have in common the property that these distributions agree with $C^\infty$
functions when tested away from the origin. We formulate this notion
in the following definition. We say that a distribution $K$ is regular
if there exists a function $k$ that is $C^\infty$ in $\mathbb{R}^d - \{0\}$, so that $K(\varphi) =
\int_{\mathbb{R}^d} k(x)\varphi(x)\,dx$ for all $\varphi \in \mathcal{D}$ whose supports are disjoint from the origin.

We also refer to this by saying that $K$ is $C^\infty$ away from the origin,
and calling $k$ the function associated to $K$. (Note that $k$ is uniquely
determined by $K$.) One should remark that the function associated to
$\mathrm{pv}(\frac{1}{x})$ is $1/x$.
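
Returning briefly to the proof of Theorem 2.3, the Gamma-function identity $\int_0^\infty e^{-tA}t^{-\lambda/2-1}\,dt = A^{\lambda/2}\Gamma(-\lambda/2)$ (for $A > 0$, $\lambda < 0$) that drives the argument can be sanity-checked numerically. The following is a minimal sketch, assuming the substitution $t = e^u$ and a trapezoid rule on a truncated $u$-interval; the function name `laplace_type_integral` and the sample values of $A$ and $\lambda$ are illustrative choices, not taken from the text.

```python
import math

def laplace_type_integral(A, lam, lo=-80.0, hi=8.0, steps=88_000):
    """Approximate int_0^inf e^{-tA} t^{-lam/2 - 1} dt via t = e^u (trapezoid rule)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        u = lo + i * h
        # With t = e^u, dt = e^u du, the integrand becomes exp(-A e^u) * exp(-lam * u / 2).
        value = math.exp(-A * math.exp(u) - lam * u / 2.0)
        total += value if 0 < i < steps else 0.5 * value
    return total * h

results = []
for A, lam in [(1.0, -1.5), (math.pi * 0.7 ** 2, -0.8), (3.0, -2.5)]:
    numeric = laplace_type_integral(A, lam)
    exact = A ** (lam / 2.0) * math.gamma(-lam / 2.0)
    results.append((numeric, exact))
    print(A, lam, numeric, exact)
```

The truncation at $u = -80$ is only adequate when $\lambda$ is not too close to $0$, which is why the sample exponents stay away from that endpoint.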

Returning to the general case one may observe that the function $k$ is
automatically homogeneous of degree $\lambda$ if the distribution $K$ is homogeneous
of degree $\lambda$. In fact, if $\varphi \in \mathcal{D}$ is supported away from the origin,
$K(\varphi) = \int k(x)\varphi(x)\,dx$, while

$$K_a(\varphi) = K(\varphi^a) = a^{-d}\int_{\mathbb{R}^d} k(x)\varphi(x/a)\,dx = \int_{\mathbb{R}^d} k_a(x)\varphi(x)\,dx.$$

Hence

$$\int_{\mathbb{R}^d}\left(a^\lambda k(x) - k_a(x)\right)\varphi(x)\,dx = 0$$

for all such $\varphi$, which means that $k_a(x) = a^\lambda k(x)$.

The above considerations and examples raise the following two questions.

Question 1. Given a function $k$, homogeneous of degree $\lambda$, and $C^\infty$
away from the origin, when does there exist a regular homogeneous
distribution $K$ of degree $\lambda$ such that $k$ is its associated function? If such a
distribution exists, to what extent is it uniquely determined by $k$?

Question 2. How do we characterize the Fourier transform of such $K$?

We answer first the second question. 


Theorem 2.4 The Fourier transform of a regular homogeneous
distribution $K$ of degree $\lambda$ is a regular homogeneous distribution of degree
$-d - \lambda$, and conversely.

Proof. We already know from Proposition 2.2 that $K^\wedge$ is homogeneous
of degree $-d - \lambda$. To prove that $K^\wedge$ agrees with a $C^\infty$ function
away from the origin, we decompose $K = K_0 + K_1$, with $K_0$ supported
near the origin and $K_1$ supported away from the origin. To do this,
fix a cut-off function $\eta$ that is $C^\infty$, is supported in $|x| \le 1$, and that
equals 1 on $|x| \le 1/2$. Write $K_0 = \eta K$, $K_1 = (1 - \eta)K$. In particular
$K_1$ is the function $(1 - \eta)k$, since $1 - \eta$ vanishes near the origin. Also
$K^\wedge = K_0^\wedge + K_1^\wedge$.

Now by Proposition 1.6, $K_0^\wedge$ is an (everywhere) $C^\infty$ function. To
prove that $K_1^\wedge$ is $C^\infty$ away from the origin we observe that by the usual
manipulations of the Fourier transform valid for tempered distributions,

$$(-4\pi^2|\xi|^2)^N\,\partial_\xi^\alpha(K_1^\wedge) = \left(\Delta^N[(-2\pi i x)^\alpha K_1]\right)^\wedge. \tag{7}$$

Recall that $\Delta$ denotes the Laplacian, $\Delta = \partial^2/\partial x_1^2 + \cdots + \partial^2/\partial x_d^2$.

Now when $|x| \ge 1$, $K_1 = k$, so there $\partial_x^\beta(K_1)$ is a bounded homogeneous
function of degree $\lambda - |\beta|$ and thus is $O(|x|^{\lambda-|\beta|})$, for $|x| \ge 1$. Therefore
$\Delta^N[x^\alpha K_1]$ is $O(|x|^{\lambda+|\alpha|-2N})$ for $|x| \ge 1$ while it is certainly a bounded
function for $|x| \le 1$. Hence for $N$ sufficiently large ($2N > \lambda + |\alpha| + d$)
this function belongs to $L^1(\mathbb{R}^d)$. As a result its Fourier transform is
continuous. (See Chapter 2 in Book III.) This shows by (7) that $\partial_\xi^\alpha(K_1^\wedge)$
agrees with a continuous function away from the origin. Since this holds
for every $\alpha$, it follows from Exercise 2 that $K_1^\wedge$ is a $C^\infty$ function away
from the origin, as desired.

Note that since the inverse Fourier transform is the Fourier transform
followed by reflection, that is, $K^\vee = (K^\wedge)^\sim$, the converse is a
consequence of the direction we have just proved.

We now turn to the first question raised above. 

Theorem 2.5 Suppose $k$ is a given $C^\infty$ function on $\mathbb{R}^d - \{0\}$ that is
homogeneous of degree $\lambda$.

(a) If $\lambda$ is not of the form $-d - m$, with $m$ a non-negative integer, then
there exists a unique distribution $K$ homogeneous of degree $\lambda$ that
agrees with $k$ away from the origin.

(b) If $\lambda = -d - m$, where $m$ is a non-negative integer, then there exists
a distribution $K$ as in (a) if and only if $k$ satisfies the cancelation
condition

$$\int_{|x|=1} x^\alpha k(x)\,d\sigma(x) = 0, \quad \text{for all } |\alpha| = m.$$

(c) Every distribution arising in (b) is of the form

$$K + \sum_{|\alpha|=m} c_\alpha\,\partial_x^\alpha\delta.$$


Proof. We deal first with the question of constructing the distribution
$K$ given by $k$. Note that the function $k$ automatically satisfies
the bound $|k(x)| \le c|x|^\lambda$. Indeed, $k(x)/|x|^\lambda$ is homogeneous of degree 0
and is bounded on the unit sphere (by continuity of $k$ there), thus it is
bounded throughout $\mathbb{R}^d - \{0\}$.

So if $\lambda > -d$, the function $k$ is locally integrable on $\mathbb{R}^d$, and thus we
can take $K$ to be the distribution defined by $k$. This local integrability
fails when $\lambda \le -d$.

In the general case we shall proceed by analytic continuation. Our
starting point is the integral

$$I(s) = I(s)(\varphi) = \int_{\mathbb{R}^d} k(x)|x|^{-\lambda+s}\varphi(x)\,dx, \quad \text{with } \varphi \in \mathcal{S}, \tag{8}$$

initially defined for complex $s$ with $\mathrm{Re}(s) > -d$, which we will see
continues to a meromorphic function in the entire complex plane. We will
then ultimately set

$$K(\varphi) = I(s)\big|_{s=\lambda}.$$

In fact, for our given homogeneous function $k$, and $\varphi$ any test function
in $\mathcal{S}$, we note by the above bound on $k$ that the integral (8) converges
when $\mathrm{Re}(s) > -d$, thus $I$ is analytic in that half-plane. Moreover $I$
continues to the whole complex plane, with at most simple poles at
$s = -d, -d-1, \ldots, -d-m, \ldots$.

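To prove this, write $I(s) = \int_{|x|\le 1} + \int_{|x|>1}$. Given the rapid decrease
of $\varphi$ at infinity, the integral over $|x| > 1$ gives an entire function of $s$.
However, for every $N \ge 0$,

$$\int_{|x|\le 1} k(x)|x|^{-\lambda+s}\varphi(x)\,dx = \sum_{|\alpha|<N}\frac{\varphi^{(\alpha)}(0)}{\alpha!}\int_{|x|\le 1} k(x)|x|^{-\lambda+s}x^\alpha\,dx + \int_{|x|\le 1} k(x)|x|^{-\lambda+s}R(x)\,dx, \tag{9}$$

where $R(x) = \varphi(x) - \sum_{|\alpha|<N}\frac{\varphi^{(\alpha)}(0)}{\alpha!}x^\alpha$, with $\varphi^{(\alpha)}(0) = \partial_x^\alpha\varphi(0)$.

Now by the homogeneity of $k$ and the use of polar coordinates, we see
that

$$\int_{|x|\le 1} k(x)|x|^{-\lambda+s}x^\alpha\,dx = \left(\int_{|x|=1} k(x)x^\alpha\,d\sigma(x)\right)\int_0^1 r^{s+|\alpha|+d-1}\,dr,$$

with the last integral equalling $1/(s + |\alpha| + d)$. Moreover the remainder
$R(x)$ satisfies $|R(x)| \le c|x|^N$, and this together with $|k(x)| \le c|x|^\lambda$
implies that $\int_{|x|\le 1} k(x)|x|^{-\lambda+s}R(x)\,dx$ is analytic in the half-plane
$\mathrm{Re}(s) > -d - N$.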

As a result, for each non-negative integer $N$, we have that $I(s)$ can be
continued in the half-plane $\mathrm{Re}(s) > -d - N$ and can be represented as

$$I(s) = \sum_{|\alpha|<N}\frac{C_\alpha}{s + |\alpha| + d} + E_N(s)$$

in that half-plane, with $E_N(s)$ analytic there, and

$$C_\alpha = \frac{\varphi^{(\alpha)}(0)}{\alpha!}\int_{|x|=1} k(x)x^\alpha\,d\sigma(x).$$

Now for our given $\lambda$ with $\lambda \ne -d, -d-1, \ldots$ we need only to take $N$ so
large that $\lambda > -d - N$, and define the distribution $K$ by setting $K(\varphi) =
I(\lambda)$. (See Figure 1.) Moreover, by keeping track of the bounds that
arise, one sees that $|K(\varphi)| \le c\|\varphi\|_M$, with $M \ge \max(N + 1, \lambda + d + 1)$,
with the norm $\|\cdot\|_M$ defined earlier. Thus $K$ is a tempered distribution.




Figure 1. The half-plane $\mathrm{Re}(s) > -d - N$, and the definition of $I(\lambda)$


To verify that $K$ agrees with the function $k$ away from the origin, we
note that whenever $\varphi$ vanishes near the origin, the integral $I(s)$ converges
for every complex $s$ and is an entire function. Therefore by (8)

$$K(\varphi) = I(\lambda) = \int_{\mathbb{R}^d} k(x)\varphi(x)\,dx.$$

This proves the claim.

Next notice that for any $a > 0$, whenever $\mathrm{Re}(s) > -d$,

$$I(s)(\varphi^a) = \int_{\mathbb{R}^d} k(x)|x|^{-\lambda+s}a^{-d}\varphi(x/a)\,dx = a^s\int_{\mathbb{R}^d} k(x)|x|^{-\lambda+s}\varphi(x)\,dx = a^s I(s)(\varphi).$$

This follows by the homogeneity of $k$, and the change of variables $x \to ax$.
As a result, $I(s)(\varphi^a) = a^s I(s)(\varphi)$ when $\mathrm{Re}(s) > -d$, and thus by analytic
continuation this continues to hold at all $s$ at which $I(s)$ is analytic, and
hence at $s = \lambda$. Therefore the distribution $K = I(\lambda)$ has the asserted
homogeneity, and this proves the existence stated in part (a) of the
theorem. If we also note that under the cancelation conditions of part (b) of
the theorem one has $C_\alpha = 0$ whenever $|\alpha| = m$, our argument also proves
the existence in that case.

We next come to the question of the uniqueness of the distribution $K$
when $\lambda \ne -d, -d-1, \ldots$. Suppose $K$ and $K_1$ are a pair of regular
distributions of degree $\lambda$, each of which agrees with $k$ away from the
origin. Then $D = K - K_1$ is supported at the origin and hence, by
Theorem 1.7, $D = \sum_{|\alpha|\le M} c_\alpha\,\partial_x^\alpha\delta$ for some constants $c_\alpha$. Now on the one hand
$D(\varphi^a) = a^\lambda D(\varphi)$, because $K$ and $K_1$ are homogeneous of degree $\lambda$. On
the other hand $\partial_x^\alpha\delta(\varphi^a) = a^{-d-|\alpha|}\,\partial_x^\alpha\delta(\varphi)$, and as a result,

$$a^\lambda D(\varphi) = \sum_{|\alpha|\le M} c_\alpha\,\partial_x^\alpha\delta(\varphi)\,a^{-d-|\alpha|} \quad \text{for all } a > 0.$$

We now invoke the following simple observation, which we state in a form
that will also be useful later.

Lemma 2.6 Suppose $\lambda_1, \lambda_2, \ldots, \lambda_n$ are distinct real numbers and that
for constants $a_j$ and $b_j$, $1 \le j \le n$, we have

$$\sum_{j=1}^n\left(a_j x^{\lambda_j} + b_j x^{\lambda_j}\log x\right) = 0 \quad \text{for all } x > 0.$$

Then $a_j = b_j = 0$ for all $1 \le j \le n$.

For $\lambda \ne -d, -d-1, \ldots$, we apply the lemma to $\lambda_1 = \lambda$, $\lambda_2 = -d$,
$\lambda_3 = -d - 1$, and so on, and $x = a$, to obtain $D(\varphi) = 0$ as desired.
If $\lambda = -d - m$, we get $D(\varphi) = \sum_{|\alpha|=m} c_\alpha\,\partial_x^\alpha\delta(\varphi)$, proving the relative
uniqueness asserted in conclusion (c) of the theorem.

To prove the lemma we assume, as one may, that $\lambda_n$ is the largest of
the $\lambda_j$'s. Then multiplying the identity by $x^{-\lambda_n}$ and letting $x$ tend to
infinity we see that $b_n$ as well as $a_n$ must vanish. Thus we are reduced
to the case when $n$ is replaced by $n - 1$, and this induction gives the
lemma.

Finally we show that when $\lambda = -d - m$ and $\int_{|x|=1} k(x)x^\alpha\,d\sigma(x) \ne 0$,
for some $\alpha$, with $|\alpha| = m$, then there does not exist a homogeneous
distribution of degree $-d - m$ that agrees with $k(x)$ away from the origin.

We consider first the case $m = 0$, and examine $I(s)$, given by (8), near
$s = -d$, in the special case $k(x) = |x|^{-d}$. In this case we use (9) with
$N = 1$, which is valid for $\mathrm{Re}(s) > -d - 1$. With $R(x) = \varphi(x) - \varphi(0)$ this
yields

$$I(s)(\varphi) = \frac{A_d\,\varphi(0)}{s + d} + \int_{|x|\le 1}[\varphi(x) - \varphi(0)]\,|x|^s\,dx + \int_{|x|>1}\varphi(x)|x|^s\,dx. \tag{10}$$

(Here $A_d = 2\pi^{d/2}/\Gamma(d/2)$ denotes the area of the unit sphere in $\mathbb{R}^d$.)
Since the two integrals are analytic when $\mathrm{Re}(s) > -d - 1$, the factor
$A_d\,\varphi(0)$ represents the residue of the pole of $I(s)(\varphi)$ at $s = -d$, and in
particular, as distributions

$$(s + d)I(s) \to A_d\,\delta \quad \text{as } s \to -d.$$

We will temporarily call $J$ the distribution that arises as the next term
in the expansion of $I(s)$ as $s \to -d$, $I(s) = \frac{A_d\,\delta}{s+d} + J + O(s + d)$, that is

$$J = \left((s + d)I(s)\right)'\big|_{s=-d}.$$


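This distribution $J$, which we shall now write as $\left[\frac{1}{|x|^d}\right]$, is given, because
of (10), by

$$\left[\frac{1}{|x|^d}\right](\varphi) = \int_{|x|\le 1}\frac{\varphi(x) - \varphi(0)}{|x|^d}\,dx + \int_{|x|>1}\frac{\varphi(x)}{|x|^d}\,dx. \tag{11}$$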

We observe the following facts about $\left[\frac{1}{|x|^d}\right]$:

(i) It is a tempered distribution. Indeed, it is easily verified that
$\left|\left[\frac{1}{|x|^d}\right](\varphi)\right| \le c\|\varphi\|_1$.

(ii) $\left[\frac{1}{|x|^d}\right]$ agrees with the function $1/|x|^d$ away from the origin; this is
because when tested with $\varphi$ that vanishes near the origin, the term
$\varphi(0)$ disappears from (11).

(iii) However, $\left[\frac{1}{|x|^d}\right]$ is not homogeneous.


What holds is the identity

$$\left[\frac{1}{|x|^d}\right](\varphi^a) = a^{-d}\left[\frac{1}{|x|^d}\right](\varphi) + a^{-d}\log(a)\,A_d\,\varphi(0), \quad \text{for all } a > 0. \tag{12}$$

To prove this note that

$$\left[\frac{1}{|x|^d}\right](\varphi^a) = a^{-d}\int_{|x|\le 1/a}\frac{\varphi(x) - \varphi(0)}{|x|^d}\,dx + a^{-d}\int_{|x|>1/a}\frac{\varphi(x)}{|x|^d}\,dx,$$

as a change of variable shows. A comparison of this with the case $a = 1$
immediately yields (12). A consequence of this identity is contained in
the following.

Corollary 2.7 There is no distribution $K_0$ that is homogeneous of
degree $-d$ and that agrees with the function $1/|x|^d$ away from the origin.


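If such a $K_0$ existed, then $K_0 - \left[\frac{1}{|x|^d}\right]$ would be supported at the
origin, and hence equal to $\sum_{|\alpha|\le M} c_\alpha\,\partial_x^\alpha\delta$. Applying this difference to
$\varphi^a$ would yield that

$$a^{-d}K_0(\varphi) - a^{-d}\left[\frac{1}{|x|^d}\right](\varphi) - a^{-d}\log(a)\,A_d\,\varphi(0) - \sum_{|\alpha|\le M} c_\alpha\,a^{-d-|\alpha|}\,\partial_x^\alpha\delta(\varphi) = 0,$$

for all $a > 0$. This leads to a contradiction with Lemma 2.6 if we take $\varphi$
so that $\varphi(0) \ne 0$.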
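
The dilation identity (12), and with it the failure of homogeneity, can be observed numerically in dimension $d = 1$, where $A_1 = 2$. The sketch below assumes the definition (11) folded onto $x > 0$, a midpoint rule, and the Gaussian $\varphi(x) = e^{-x^2}$ as test function; `midpoint` and `inv_abs_x` are hypothetical helper names.

```python
import math

def midpoint(f, lo, hi, steps):
    """Midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def inv_abs_x(psi, upper=60.0):
    """[1/|x|](psi) for d = 1, following (11), with both half-lines folded onto x > 0."""
    psi0 = psi(0.0)
    inner = midpoint(lambda x: (psi(x) + psi(-x) - 2.0 * psi0) / x, 0.0, 1.0, 20_000)
    outer = midpoint(lambda x: (psi(x) + psi(-x)) / x, 1.0, upper, 118_000)
    return inner + outer

def phi(x):
    # Illustrative test function with phi(0) = 1.
    return math.exp(-x * x)

a = 3.0
A_1 = 2.0                                   # "area" of the unit sphere {-1, 1} in R
phi_a = lambda x: phi(x / a) / a            # dual dilation phi^a for d = 1
J_phi = inv_abs_x(phi)
lhs = inv_abs_x(phi_a)                      # left-hand side of (12)
rhs = (J_phi + math.log(a) * A_1 * phi(0.0)) / a
print(lhs, rhs, J_phi / a)
```

The first two printed values should agree, while the third (what homogeneity of degree $-1$ would predict) differs from them by the logarithmic term $a^{-1}\log(a)\,A_1\,\varphi(0)$.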

The result of Corollary 2.7 can be restated as follows. If $k$ is
homogeneous of degree $-d$, and $\int_{|x|=1} k(x)\,d\sigma(x) \ne 0$, then there is no
distribution $K$ homogeneous of degree $-d$ that agrees with $k$ away from the
origin.

Indeed, write $k(x) = \frac{c}{|x|^d} + k_1(x)$, where

$$c\int_{|x|=1} d\sigma(x) = \int_{|x|=1} k(x)\,d\sigma(x),$$

and $c \ne 0$, while $\int_{|x|=1} k_1(x)\,d\sigma(x) = 0$. Now if $K_1$ is the distribution
whose associated function is $k_1$, and whose existence is guaranteed by
conclusion (b), then $\frac{1}{c}(K - K_1)$ would be a homogeneous distribution of
degree $-d$ agreeing with $1/|x|^d$ away from the origin. This we have seen
is precluded by Corollary 2.7.

Finally, turning to the general case, suppose $K$ is a homogeneous
distribution of degree $-d - m$, whose associated function is $k(x)$. Let $K' =
x^\alpha K$ for some $\alpha$ with $|\alpha| = m$ and $\int_{|x|=1} k(x)x^\alpha\,d\sigma(x) \ne 0$. Then clearly
$K'$ is homogeneous of degree $-d - m + |\alpha| = -d$, while $k'(x) = x^\alpha k(x)$
is its associated function. However now

$$\int_{|x|=1} k'(x)\,d\sigma(x) \ne 0,$$

which contradicts the special case $\lambda = -d$ considered above. The
theorem is therefore completely proved.


Remark 1. The results of the theorems continue to hold with minor
modifications if $\lambda$, which was assumed to be real, is allowed to be
complex. In this situation the proof of Lemma 2.6 needs a slight additional
argument, which is indicated in Exercise 20.

Remark 2. When $\lambda = -d$ with $k$ satisfying the cancelation condition
$\int_{|x|=1} k(x)\,d\sigma(x) = 0$, the resulting distribution $K$ is then a natural
generalization of $\mathrm{pv}(\frac{1}{x})$ in $\mathbb{R}$ considered earlier. Indeed, as we have seen

$$K(\varphi) = \int_{|x|\le 1} k(x)[\varphi(x) - \varphi(0)]\,dx + \int_{|x|>1} k(x)\varphi(x)\,dx$$

and this equals the "principal value"

$$\lim_{\epsilon\to 0}\int_{\epsilon\le|x|} k(x)\varphi(x)\,dx$$

because $\int_{|x|=1} k(x)\,d\sigma(x) = 0$. Distributions
of this kind, first studied by Mihlin, Calderón and Zygmund, are often
denoted by $\mathrm{pv}(k)$.


2.3 Fundamental solutions 

Among the most significant examples of distributions are fundamental
solutions of partial differential equations and derivatives of these
fundamental solutions. Suppose $L$ is a partial differential operator

$$L = \sum_{|\alpha|\le m} a_\alpha\,\partial_x^\alpha \quad \text{on } \mathbb{R}^d,$$

with $a_\alpha$ complex constants. A fundamental solution of $L$ is a
distribution $F$ so that

$$L(F) = \delta,$$

where $\delta$ is the Dirac delta function. The importance of a fundamental
solution$^9$ is that it implies that the operator $f \mapsto T(f) = F * f$, mapping
$\mathcal{D}$ to $C^\infty$, is an "inverse" to $L$. One way to interpret this is the statement
that

$$LT = TL = I$$

when acting on $\mathcal{D}$. This holds because as we have seen earlier in this
chapter, $\partial_x^\alpha(F * f) = (\partial_x^\alpha F) * f = F * (\partial_x^\alpha f)$ for all $\alpha$, hence $L(F * f) =
(LF) * f = F * (Lf)$, while of course $\delta * f = f$.

$^9$ Note that a fundamental solution is not unique since we can always add to it a solution
of the homogeneous equation $L(u) = 0$.

Now let

$$P(\xi) = \sum_{|\alpha|\le m} a_\alpha\,(2\pi i\xi)^\alpha$$

be the characteristic polynomial of the operator $L$. Since, for example
when $f$ belongs to $\mathcal{S}$, one has $(L(f))^\wedge = P \cdot \hat{f}$, we might hope to find
such an $F$ by defining it via $\hat{F}(\xi) = 1/P(\xi)$ or as

$$F = \int_{\mathbb{R}^d}\frac{1}{P(\xi)}\,e^{2\pi i x\cdot\xi}\,d\xi, \tag{13}$$

taken in an appropriate sense.

The main problem with this approach in the general case is due to the
zeros of $P$ and the resulting difficulty of defining $1/P(\xi)$ as a distribution.
However in a number of interesting cases this can be done quite directly.

We consider first the Laplacian

$$\Delta = \sum_{j=1}^d\frac{\partial^2}{\partial x_j^2} \quad \text{in } \mathbb{R}^d.$$

Here $1/P(\xi) = 1/(-4\pi^2|\xi|^2)$, and when $d \ge 3$ this function is locally
integrable, and the required calculation of a fundamental solution is given
by Theorem 2.3. This results in the following.

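Theorem 2.8 For $d \ge 3$, the locally integrable function $F$ defined by
$F(x) = C_d|x|^{-d+2}$ is a fundamental solution for the operator $\Delta$, with

$$C_d = -\frac{\Gamma\left(\frac{d}{2} - 1\right)}{4\pi^{d/2}}.$$

This follows by taking $\lambda = -d + 2$ (in Theorem 2.3), then $\Gamma\left(\frac{d+\lambda}{2}\right) =
\Gamma(1) = 1$, while $\Gamma(d/2) = (d/2 - 1)\Gamma(d/2 - 1)$. Therefore $\hat{F}(\xi)$ equals
$1/(-4\pi^2|\xi|^2)$, and hence

$$(\Delta F)^\wedge = 1, \quad \text{which means } \Delta F = \delta.$$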
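
The bookkeeping of constants in this argument is easy to verify by machine. The sketch below assumes the normalizations $c_\lambda = \Gamma((d+\lambda)/2)\,\pi^{-d/2-\lambda}/\Gamma(-\lambda/2)$ from Theorem 2.3 and $C_d = -\Gamma(d/2-1)/(4\pi^{d/2})$, and checks that their product at $\lambda = -d+2$ is $-1/(4\pi^2)$, so that $\hat{F}(\xi) = 1/(-4\pi^2|\xi|^2)$. The function names are illustrative.

```python
import math

def c_lambda(d, lam):
    """Constant of Theorem 2.3: (H_lam)^ = c_lam * H_{-d-lam}, valid for -d < lam < 0."""
    return math.gamma((d + lam) / 2.0) / math.gamma(-lam / 2.0) * math.pi ** (-d / 2.0 - lam)

def C_d(d):
    """Assumed normalizing constant of the fundamental solution C_d |x|^{-d+2}, d >= 3."""
    return -math.gamma(d / 2.0 - 1.0) / (4.0 * math.pi ** (d / 2.0))

target = -1.0 / (4.0 * math.pi ** 2)
products = {}
for d in range(3, 8):
    lam = -d + 2
    # The Fourier transform of C_d |x|^{-d+2} is C_d * c_lam * |xi|^{-2}.
    products[d] = C_d(d) * c_lambda(d, lam)
    print(d, products[d], target)

newtonian = C_d(3)   # should be the classical value -1/(4 pi) in dimension three
print(newtonian)
```

For $d = 3$ the constant reduces to $-1/(4\pi)$, the familiar Newtonian potential normalization.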


The case of two dimensions leads to the following variant.

Theorem 2.9 When $d = 2$, the locally integrable function $\frac{1}{2\pi}\log|x|$ is a
fundamental solution of $\Delta$.

This fundamental solution arises when considering the limiting case $\lambda \to
-d + 2 = 0$ in Theorem 2.3. It can be given formally as

$$-\frac{1}{4\pi^2}\int_{\mathbb{R}^2}\frac{1}{|\xi|^2}\,e^{2\pi i x\cdot\xi}\,d\xi,$$

but we need to assign a meaning to this non-convergent integral. In fact,
we shall be led to the distribution $\left[\frac{1}{|x|^d}\right]$ considered in (11). We start
with the identity

$$\int_{\mathbb{R}^2}\hat{\varphi}(x)|x|^\lambda\,dx = c_\lambda\int_{\mathbb{R}^2}\varphi(\xi)|\xi|^{-\lambda-2}\,d\xi \tag{14}$$

with $-2 < \lambda < 0$, and $c_\lambda = \frac{\Gamma(1+\lambda/2)}{\Gamma(-\lambda/2)}\,\pi^{-1-\lambda}$. We examine (14) near $\lambda = 0$
and use the fact that $c_\lambda \sim -\lambda/(2\pi) + c'\lambda^2$ as $\lambda \to 0$, for some constant
$c'$. This follows from the fact that $\Gamma(1) = 1$, the function $\Gamma(s)$ is smooth
near $s = 1$, and the identity $\Gamma(s + 1) = s\Gamma(s)$ with $s = -\lambda/2$. Looking
back at (10) (with $s = -\lambda - 2$), we differentiate both sides of (14) with
respect to $\lambda$, which is justified by the rapid decay of $\varphi$ and $\hat{\varphi}$. After a
multiplication of $1/2\pi$ the result is, upon letting $\lambda \to 0$,


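$$\frac{1}{2\pi}\int_{\mathbb{R}^2}\hat{\varphi}(x)\log|x|\,dx = -\frac{1}{4\pi^2}\left\{\int_{|x|\le 1}\frac{\varphi(x) - \varphi(0)}{|x|^2}\,dx + \int_{|x|>1}\frac{\varphi(x)}{|x|^2}\,dx\right\} - c'\varphi(0).$$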

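That is, if we take $F = \frac{1}{2\pi}\log|x|$, then

$$\hat{F} = -\frac{1}{4\pi^2}\left[\frac{1}{|x|^2}\right] - c'\delta.$$

Now it is clear that $|x|^2\delta = 0$, because $|x|^2\delta(\varphi) = |x|^2\varphi(x)\big|_{x=0} = 0$. Also,
for all $\varphi \in \mathcal{S}$, $|x|^2\left[\frac{1}{|x|^2}\right](\varphi) = \int_{\mathbb{R}^2}\varphi(x)\,dx$, which means $|x|^2\left[\frac{1}{|x|^2}\right]$ equals
the function 1. Thus $(\Delta F)^\wedge = -4\pi^2|x|^2\hat{F} = 1$, and so $\Delta F = \delta$, proving
that $F$ is a fundamental solution for $\Delta$ on $\mathbb{R}^2$.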


We shall next give an explicit fundamental solution for the heat
operator

$$L = \frac{\partial}{\partial t} - \Delta_x$$




128 


Chapter 3. DISTRIBUTIONS ： GENERALIZED FUNCTIONS 


taken over $\mathbb{R}^{d+1} = \mathbb{R}^d \times \mathbb{R}$, with $(x, t) \in \mathbb{R}^d \times \mathbb{R}$, and $\Delta_x$ the Laplacian
in the $x$-variables, $x \in \mathbb{R}^d$. We do this by linking the inhomogeneous
equation $L(u) = g$ with the homogeneous initial-value problem, $L(u) = 0$
for $t > 0$ with $u(x, t)|_{t=0} = f(x)$ given on $\mathbb{R}^d$.

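Recall from Chapters 5 and 6 in Book I that the latter problem is
solved by the heat kernel

$$\mathcal{H}_t(x) = (4\pi t)^{-d/2}e^{-|x|^2/(4t)}, \quad \text{with } \hat{\mathcal{H}}_t(\xi) = e^{-4\pi^2 t|\xi|^2},$$

where the Fourier transform is taken only in the $x$-variables. This shows
that if $f \in \mathcal{S}$, then $u(x, t) = (\mathcal{H}_t * f)(x)$ solves the equation $L(u) = 0$,
while $u(x, t) \to f(x)$ in $\mathcal{S}$ as $t \to 0$. Notice also that

$$\frac{\partial\mathcal{H}_t(x)}{\partial t} = \Delta_x\mathcal{H}_t(x), \quad \text{and} \quad \int_{\mathbb{R}^d}\mathcal{H}_t(x)\,dx = 1,$$

and $\mathcal{H}_t$ is an "approximation to the identity." (For these properties of $\mathcal{H}_t$
see Chapter 5, Book I and Chapter 3 in Book III.)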
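
Both properties of the heat kernel just quoted are easy to test numerically in dimension $d = 1$. The following is a rough sketch, assuming the explicit formula $\mathcal{H}_t(x) = (4\pi t)^{-1/2}e^{-x^2/(4t)}$, a midpoint rule for the mass, and central finite differences for the derivatives; the step sizes and sample points are arbitrary illustrative choices.

```python
import math

def heat_kernel(x, t):
    """H_t(x) = (4 pi t)^{-1/2} exp(-x^2 / (4 t)) in dimension d = 1, for t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def total_mass(t, half_width=30.0, steps=60_000):
    """Midpoint-rule approximation of the integral of H_t over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    return h * sum(heat_kernel(-half_width + (i + 0.5) * h, t) for i in range(steps))

def heat_residual(x, t, ht=1e-4, hx=1e-3):
    """Finite-difference approximation of dH/dt - d^2H/dx^2 at (x, t)."""
    dt = (heat_kernel(x, t + ht) - heat_kernel(x, t - ht)) / (2.0 * ht)
    dxx = (heat_kernel(x + hx, t) - 2.0 * heat_kernel(x, t) + heat_kernel(x - hx, t)) / hx ** 2
    return dt - dxx

masses = [total_mass(t) for t in (0.05, 0.5, 2.0)]
residuals = [heat_residual(x, t) for x, t in ((0.7, 0.5), (-1.3, 0.5), (0.2, 2.0))]
print(masses)
print(residuals)
```

The masses should all be close to 1 and the residuals close to 0, up to discretization error.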

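Now on $\mathbb{R}^{d+1}$ define $F$ by

$$F(x, t) = \begin{cases}\mathcal{H}_t(x) & \text{if } t > 0,\\ 0 & \text{if } t \le 0.\end{cases}$$

It follows that $F$ is locally integrable on $\mathbb{R}^{d+1}$ (and in fact one has
$\int_{|t|\le R}\int_{\mathbb{R}^d} F(x, t)\,dx\,dt \le R$), and so $F$ defines a tempered distribution
on $\mathbb{R}^{d+1}$.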

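Theorem 2.10 $F$ is a fundamental solution of $L = \frac{\partial}{\partial t} - \Delta_x$.

Proof. Since $LF(\varphi) = F(L'\varphi)$ with $L' = -\frac{\partial}{\partial t} - \Delta_x$, it suffices to see
that $F(L'\varphi)$, which equals

$$\lim_{\epsilon\to 0}\int_{t\ge\epsilon}\int_{\mathbb{R}^d} F(x, t)\left(-\frac{\partial}{\partial t} - \Delta_x\right)\varphi(x, t)\,dx\,dt,$$

is $\delta(\varphi) = \varphi(0, 0)$.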

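Now $F(x, t) = \mathcal{H}_t(x)$ when $t > 0$, so an integration by parts in the
$x$-variables, together with $\Delta_x\mathcal{H}_t = \partial\mathcal{H}_t/\partial t$, gives

$$\int_{t\ge\epsilon}\int_{\mathbb{R}^d} F(x, t)\left(-\frac{\partial}{\partial t} - \Delta_x\right)\varphi(x, t)\,dx\,dt = -\int_{\mathbb{R}^d}\int_{t\ge\epsilon}\frac{\partial}{\partial t}\left(\mathcal{H}_t(x)\varphi(x, t)\right)dt\,dx = \int_{\mathbb{R}^d}\mathcal{H}_\epsilon(x)\varphi(x, \epsilon)\,dx.$$

However, because $\varphi \in \mathcal{S}$, one has $|\varphi(x, \epsilon) - \varphi(x, 0)| \le O(\epsilon)$ uniformly
in $x$. Therefore

$$\int_{\mathbb{R}^d}\mathcal{H}_\epsilon(x)\varphi(x, \epsilon)\,dx = \int_{\mathbb{R}^d}\mathcal{H}_\epsilon(x)\left(\varphi(x, 0) + O(\epsilon)\right)dx,$$

and this tends to $\varphi(0, 0)$, since $\mathcal{H}_t$ is an approximation to the identity.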

An alternate proof can be given by computing the Fourier transform 
of F, as in Exercise 21. 


2.4 Fundamental solution to general partial differential equations with constant coefficients

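We now tackle the general case of any constant coefficient partial
differential operator $L$ on $\mathbb{R}^d$ by addressing the convergence issues raised
by (13), where a candidate for a fundamental solution $F$ was written as

$$F = \int_{\mathbb{R}^d}\frac{1}{P(\xi)}\,e^{2\pi i x\cdot\xi}\,d\xi,$$

with $P$ the characteristic polynomial of the operator $L$. Ignoring for
a moment the problem of convergence, we note that if $\varphi \in \mathcal{D}$, then we
would have

$$F(\varphi) = \int_{\mathbb{R}^d}\varphi(x)\int_{\mathbb{R}^d}\frac{1}{P(\xi)}\,e^{2\pi i x\cdot\xi}\,d\xi\,dx$$

and hence after interchanging the order of integration,

$$F(\varphi) = \int_{\mathbb{R}^d}\frac{\hat{\varphi}(-\xi)}{P(\xi)}\,d\xi. \tag{15}$$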


To circumvent the obstacle that arises in (15) because of possible zeroes
of $P$, we shift the line of integration in the $\xi_1$-variable to avoid any zeroes
of the polynomial $p(z) = P(z, \xi')$, where $\xi' = (\xi_2, \ldots, \xi_d)$ is fixed. The
result we obtain is as follows.


Theorem 2.11 Every constant coefficient (linear) partial differential
equation $L$ on $\mathbb{R}^d$ has a fundamental solution.

Proof. After a possible change of coordinates consisting of a rotation
and multiplication by a constant, we may assume that the characteristic
polynomial of $L$ will be of the form

$$P(\xi) = P(\xi_1, \xi') = \xi_1^m + \sum_{j=0}^{m-1}\xi_1^j\,Q_j(\xi'),$$

where each $Q_j$ is a polynomial of degree at most $m - j$. A proof that
a general polynomial $P$ can be written in the above form can be found
for instance in Section 3, Chapter 5, Book III, where an earlier version
of the "invertibility" of $L$ appears.

For each $\xi'$, the polynomial $p(z) = P(z, \xi')$ has $m$ roots in $\mathbb{C}$, which
can be ordered lexicographically, say $\alpha_1(\xi'), \ldots, \alpha_m(\xi')$. We claim that
we can pick an integer $n(\xi')$ so that:

(i) $|n(\xi')| \le m + 1$ for all $\xi'$.

(ii) If $\mathrm{Im}(\xi_1) = n(\xi')$, then $|\xi_1 - \alpha_j(\xi')| \ge 1$ for all $j = 1, \ldots, m$.

(iii) The function $\xi' \mapsto n(\xi')$ is measurable.

Indeed, for each $\xi'$ the polynomial $p$ has $m$ zeroes, so at least one of the
$m + 1$ intervals $I_\ell = [-m - 1 + 2\ell, -m - 1 + 2(\ell + 1))$ (for $\ell = 0, \ldots, m$)
has the property that it does not contain any of the imaginary parts of
the zeroes of $p$. We can then set $n(\xi')$ to be the mid-point of such
interval $I_\ell$ with the smallest $\ell$ having the above property. Condition (ii)
is then automatically satisfied. Finally, Rouché's theorem$^{10}$ applied to
small circles around the zeroes of $p$ shows that $\alpha_1(\xi'), \ldots, \alpha_m(\xi')$ are
continuous functions of $\xi'$, and this implies (iii).
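
The selection of the height $n(\xi')$ is a small pigeonhole algorithm, and it may help to see it spelled out. The sketch below takes the roots of $p$ as given (the sample roots are hypothetical, not computed from any particular operator) and returns the mid-point of the first interval $I_\ell$ that is free of imaginary parts of roots. The function name `choose_contour_height` is introduced here for illustration.

```python
def choose_contour_height(roots):
    """Mid-point of the first interval I_l = [-m-1+2l, -m-1+2(l+1)), l = 0..m,
    that contains no imaginary part of a root (m = number of roots)."""
    m = len(roots)
    for l in range(m + 1):
        lo = -m - 1 + 2 * l
        hi = lo + 2
        if not any(lo <= z.imag < hi for z in roots):
            return lo + 1          # mid-point of I_l, an integer with |n| <= m
    raise AssertionError("unreachable: m roots cannot occupy all m + 1 intervals")

# Hypothetical roots of p(z) = P(z, xi') for one fixed xi' (here m = 2).
roots = [complex(0.2, -2.5), complex(-1.0, 0.3)]
n = choose_contour_height(roots)

# On the line Im(xi_1) = n every point stays at distance >= 1 from each root.
distances = [abs(complex(t, n) - z) for z in roots for t in (-5.0, 0.0, 5.0)]
print(n, min(distances))
```

Since the imaginary part of every root differs from $n$ by at least 1, the distance bound in condition (ii) holds at every point of the line, not only at the sampled ones.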

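So, instead of (15) we now define

$$F(\varphi) = \int_{\mathbb{R}^{d-1}}\int_{\mathrm{Im}(\xi_1)=n(\xi')}\frac{\hat{\varphi}(-\xi)}{P(\xi)}\,d\xi_1\,d\xi', \quad \text{whenever } \varphi \in \mathcal{D}. \tag{16}$$

In the above, the inner integral is taken over the line $\{\mathrm{Im}(\xi_1) = n(\xi')\}$
in the complex $\xi_1$-space.

$^{10}$ See for example Chapter 3 in Book II.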

To see that $F$ is well defined as a distribution, recall first that since $\varphi$
has compact support then $\hat{\varphi}$ is analytic with rapid decay on each line
parallel to the real axis, so it suffices to show that $P$ is uniformly bounded
from below on the line of integration. To this end, fix $\xi$ on such line, and
consider the polynomial in one variable $q(z) = P(\xi_1 + z, \xi')$. Then $q$ is a
polynomial of degree $m$ with leading coefficient 1, so if $\lambda_1, \ldots, \lambda_m$ denote
the roots of $q$, then $q(z) = (z - \lambda_1)\cdots(z - \lambda_m)$. By (ii) above we have
that $|\lambda_j| \ge 1$ for all $j$, hence $|P(\xi)| = |q(0)| = |\lambda_1\cdots\lambda_m| \ge 1$, as desired.
Therefore $F$ defines a distribution.

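Finally, the rapid decrease also allows us to differentiate under the
integral sign, so if $L' = \sum_{|\alpha|\le m} a_\alpha(-1)^{|\alpha|}\partial_x^\alpha$, then the characteristic
polynomial of $L'$ is $P(-\xi)$, therefore $(L'(\varphi))^\wedge = P(-\xi)\hat{\varphi}(\xi)$. Hence

$$(LF)(\varphi) = F(L'(\varphi)) = \int_{\mathbb{R}^{d-1}}\int_{\mathrm{Im}(\xi_1)=n(\xi')}\hat{\varphi}(-\xi)\,d\xi_1\,d\xi'.$$

We can now deform the contour of integration back to the real line, so
that

$$(LF)(\varphi) = \int_{\mathbb{R}^d}\hat{\varphi}(-\xi)\,d\xi = \varphi(0) = \delta(\varphi),$$

which completes the proof of the theorem.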

Remark. We obtain from this the following existence theorem:
whenever $f \in C_0^\infty(\mathbb{R}^d)$, there exists a $u \in C^\infty(\mathbb{R}^d)$ so that $L(u) = f$. This is
clear if we take $u = F * f$, with $F$ the fundamental solution above.$^{11}$ It
should also be pointed out that an analogous solvability fails if $L$ is not
constant-coefficient, as is seen in Section 8.3 of Chapter 7.

2.5 Parametrices and regularity for elliptic equations 

In many instances it is convenient to replace the notion of a fundamental
solution by a more flexible variant, that of an "approximate fundamental
solution" or parametrix. Given a differential operator $L$ with constant
coefficients, a parametrix for $L$ is a distribution $Q$, so that

$$LQ = \delta + r$$

where the "error" $r$ is in (say) $\mathcal{S}$. In this sense, the difference $LQ - \delta$ is
small.

11 This result may be compared with Section 3 in Chapter 5 of Book III, where not- 
necessarily smooth solutions are found by a different method. 
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Of particular interest are parametrices that are smooth away from the 
origin. Adopting the terminology used earlier, we say that $Q$ is regular
if this distribution agrees with a $C^\infty$ function away from the origin.

An important class of partial differential operators that have regular
parametrices are the elliptic operators. A given partial differential
operator $L = \sum_{|\alpha|\le m} a_\alpha\,\partial_x^\alpha$, of order $m$, is said to be elliptic if its
characteristic polynomial $P$ satisfies the inequality $|P(\xi)| \ge c|\xi|^m$, for some
$c > 0$, and all sufficiently large $\xi$. Note that this is the same as assuming
that $P_m$, the principal part of $P$ (the part of $P$ which is homogeneous of
degree $m$), has the property that $P_m(\xi) = 0$ only when $\xi = 0$.

Note, for example, that the Laplacian $\Delta$ is elliptic.
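
The ellipticity criterion can be illustrated on the two operators met so far. The sketch below assumes the convention $P(\xi) = \sum a_\alpha(2\pi i\xi)^\alpha$ for the characteristic polynomial, so that the Laplacian has $P(\xi) = -4\pi^2|\xi|^2$ and the heat operator on $\mathbb{R}^{d+1}$ has $P(\xi, \tau) = 2\pi i\tau + 4\pi^2|\xi|^2$; the function names and sample points are illustrative.

```python
import math

def laplacian_symbol(xi):
    """P(xi) = sum_j (2 pi i xi_j)^2 = -4 pi^2 |xi|^2."""
    return sum((2j * math.pi * x) ** 2 for x in xi)

def heat_principal_symbol(xi, tau):
    """Degree-2 part of P(xi, tau) = 2 pi i tau + 4 pi^2 |xi|^2 (tau does not enter)."""
    return 4.0 * math.pi ** 2 * sum(x * x for x in xi)

samples = [(1.0, 0.0), (0.3, -2.0), (-5.0, 4.0)]
ratios = [abs(laplacian_symbol(xi)) / sum(x * x for x in xi) for xi in samples]
print(ratios)                       # each ratio is 4 pi^2, so |P(xi)| >= c |xi|^2

# The principal part of the heat operator vanishes at a nonzero point (xi, tau) = (0, 1).
degenerate = heat_principal_symbol((0.0, 0.0), 1.0)
print(degenerate)
```

The Laplacian's symbol is bounded below by a multiple of $|\xi|^2$, while the principal part of the heat operator vanishes on the whole $\tau$-axis, so the heat operator is not elliptic.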


Theorem 2.12 Every elliptic operator has a regular parametrix. 

Proof. Observe first by a straightforward inductive argument in $k$
that whenever $|\alpha| = k$ and $P$ is any polynomial

$$\partial_\xi^\alpha\left(\frac{1}{P(\xi)}\right) = \sum_{0\le\ell\le k}\frac{q_\ell(\xi)}{P(\xi)^{\ell+1}},$$

where each $q_\ell$ is a polynomial of degree $\le \ell m - k$.

Now suppose $|P(\xi)| \ge c|\xi|^m$, whenever $|\xi| \ge c_1$, and let $\gamma$ be a $C^\infty$
function which is equal to 1 for all large values of $\xi$ and is supported in
$|\xi| \ge c_1$. Then observe from the above identity that

$$\left|\partial_\xi^\alpha\left(\frac{\gamma(\xi)}{P(\xi)}\right)\right| \le A_\alpha|\xi|^{-m-|\alpha|}. \tag{17}$$

Now let $Q$ be the tempered distribution whose Fourier transform is the
(bounded) function $\gamma(\xi)/P(\xi)$. Taking up the same argument as in the
proof of Theorem 2.4, we have

$$\left((-4\pi^2|x|^2)^N\,\partial_x^\beta Q\right)^\wedge = \Delta_\xi^N\left[(2\pi i\xi)^\beta(\gamma/P)\right].$$

Because of (17) and Leibnitz's rule, the right-hand side above is clearly
dominated by $A'_N|\xi|^{-m+|\beta|-2N}$ for $|\xi| \ge 1$; it is also bounded when $|\xi| \le
1$. Thus as soon as $2N + m - |\beta| > d$, this function is integrable, and
therefore $|x|^{2N}\partial_x^\beta Q$, being its inverse Fourier transform up to a
multiplicative constant, is continuous. Since this is true for each $\beta$, we see
that $Q$ agrees with a $C^\infty$ function away from the origin.

Note moreover that $(LQ)^\wedge = P(\xi)[\gamma(\xi)/P(\xi)] = \gamma(\xi) = 1 + (\gamma(\xi) -
1)$. By its definition, $\gamma(\xi) - 1$ is in $\mathcal{D}$, and hence $\gamma(\xi) - 1 = \hat{r}$, for some
$r \in \mathcal{S}$. Finally, $(LQ)^\wedge = 1 + \hat{r}$, which means $LQ = \delta + r$, as was to be
shown.

The following variant is useful. 

Corollary 2.13 Given any $\epsilon > 0$, the elliptic operator $L$ has a regular
parametrix $Q_\epsilon$ that is supported in the ball $\{x : |x| \le \epsilon\}$.

In fact, let $\eta_\epsilon$ be a cut-off function in $\mathcal{D}$, that is 1 when $|x| \le \epsilon/2$,
and that is supported where $|x| \le \epsilon$. Set $Q_\epsilon = \eta_\epsilon Q$, and observe that
$L(\eta_\epsilon Q) - \eta_\epsilon L(Q)$ involves only terms that are derivatives of $\eta_\epsilon$ of
positive order, and these vanish when $|x| < \epsilon/2$. The difference is therefore a
$C^\infty$ function. However, $\eta_\epsilon L(Q) = \eta_\epsilon(\delta + r) = \delta + \eta_\epsilon r$. Altogether, this
gives $L(Q_\epsilon) = \delta + r_\epsilon$, where $r_\epsilon$ is a $C^\infty$ function. Notice that $r_\epsilon$ is
automatically also supported in $|x| \le \epsilon$.

Elliptic operators satisfy the following basic regularity property. 

Theorem 2.14 Suppose the partial differential operator $L$ has a regular
parametrix. Assume $U$ is a distribution given in an open set $\Omega \subset \mathbb{R}^d$ and
$L(U) = f$, with $f$ a $C^\infty$ function in $\Omega$. Then $U$ is also a $C^\infty$ function
on $\Omega$. In particular, this holds whenever $L$ is elliptic.

Remark. The terminology hypo-elliptic is used to denote operators
for which the above regularity holds. The prefix "hypo" reflects the
fact that there are non-elliptic operators (for example the heat operator
$\frac{\partial}{\partial t} - \Delta_x$) that also have this property as a result of the fact that they
have a regular fundamental solution. However, it should be noted that
for general partial differential operators, hypo-ellipticity fails; a good
example is the wave operator. (See Exercise 22 and Problem 7*.)

Proof of the theorem. It suffices to show that $U$ agrees with a $C^\infty$
function on any ball $B$ with $\overline{B} \subset \Omega$. Fix such a ball (say of radius $\rho$),
and let $B_1$ be the concentric ball having radius $\rho + \epsilon$, with $\epsilon > 0$ so small
that $\overline{B_1} \subset \Omega$. Next, choose a cut-off function $\eta$ in $\mathcal{D}$, supported in $\Omega$,
with $\eta(x) = 1$ in a neighborhood of $\overline{B_1}$. Define $U_1 = \eta U$. Then $U_1$ and
$L(U_1) = F_1$ are distributions of compact support in $\mathbb{R}^d$ and moreover $F_1$
agrees with a $C^\infty$ function (that is, $f$) in a neighborhood of $\overline{B_1}$. Thus
$F_1$ agrees in a smaller neighborhood of $\overline{B_1}$ with a $C^\infty$ function $f_1$ that
has compact support.

We now apply the parametrix $Q_\epsilon$ supported in $\{|x| \le \epsilon\}$ whose
existence is guaranteed by Corollary 2.13. On the one hand,

$$Q_\epsilon * L(U_1) = L(Q_\epsilon) * U_1 = (\delta + r_\epsilon) * U_1 = U_1 + r_\epsilon * U_1,$$

and since $r_\epsilon * U_1 = U_1 * r_\epsilon$ by Proposition 1.1, we have that $r_\epsilon * U_1$ is a
$C^\infty$ function in $\mathbb{R}^d$. On the other hand,

$$Q_\epsilon * L(U_1) = Q_\epsilon * F_1 = Q_\epsilon * f_1 + Q_\epsilon * (F_1 - f_1).$$

Now again, $Q_\epsilon * f_1$ is a $C^\infty$ function, while by Proposition 1.3, $Q_\epsilon * (F_1 -
f_1)$ is supported in the closure of the $\epsilon$-neighborhood of the support of
$F_1 - f_1$. Since $F_1 - f_1$ vanishes in a neighborhood of $\overline{B_1}$ it follows that
$Q_\epsilon * (F_1 - f_1)$ vanishes in $B$. Altogether then $U_1$ is a $C^\infty$ function on
$B$. Since $U_1 = \eta U$ and $\eta$ equals 1 in $B$, then $U$ is a $C^\infty$ function in $B$,
and the theorem is therefore proved.


3 Calderón-Zygmund distributions and $L^p$ estimates

We will now consider an important class of operators that generalize the
Hilbert transform and that have a corresponding $L^p$ theory. These arise
as "singular integrals," that is, as convolution operators $T$ given by

$$T(f) = f * K, \tag{18}$$

with $K$ that are appropriate distributions. Among kernels $K$ of this kind
the first considered were homogeneous distributions of critical degree $-d$,
similar to those described in Remark 2 at the end of Section 2.2.$^{12}$
Over time, various generalizations and extensions of these operators have
arisen. Here we want to restrict our attention to a narrow but
particularly simple and useful class of such operators, which have the added
feature that they can be defined either in terms of (18) or in terms of
the Fourier transform via

$$(Tf)^\wedge(\xi) = m(\xi)\hat{f}(\xi). \tag{19}$$

The reciprocity of the resulting conditions on the kernel $K$ and the
multiplier $m$, with $m = K^\wedge$, can then be seen as a generalization of
Theorem 2.4 when $\lambda = -d$.

3.1 Defining properties 

We consider a distribution $K$ that is "regular" in the terminology used
in Sections 2.2 and 2.5. This means that for such $K$ there is a function $k$
that is $C^\infty$ away from the origin so that $K$ agrees with $k$ away from the
origin. Given a $K$ of this kind, we consider the following differential
inequalities for its associated function $k$,

$$|\partial_x^\alpha k(x)| \le c_\alpha|x|^{-d-|\alpha|}, \quad \text{for all } \alpha. \tag{20}$$

Notice that the above for $\alpha = 0$ implies that the distribution $K$ is
tempered.

$^{12}$ Without however requiring a high degree of smoothness of $k$.

In addition to (20) we formulate a cancelation condition as follows. 
Given an integer n, we say that p is a (7( n )-normalized bump function 
if p is a C°° function supported in the unit ball and 

sup \d^(f(x)\ < 1, all \a\ < n. 

X 

We define ip r by ^p r {x) = for r > 0. Our condition is then that for 

some fixed n > 1, there is an A so that 

(21) sup \K{if r )\ < for all (7( n )-normalized bump functions if. 

0<r 

Proposition 3.1 The following three properties of a distribution $K$ are
equivalent.

(i) $K$ is regular and satisfies the differential inequalities (20) together
with the cancelation property (21).

(ii) $K$ is tempered, and $m = K^\wedge$ is a function that is $C^\infty$ away from
the origin that satisfies

$$|\partial_\xi^\alpha m(\xi)| \le c'_\alpha |\xi|^{-|\alpha|}, \quad \text{for all } \alpha. \tag{22}$$

(iii) $K$ is a regular distribution that satisfies the differential
inequalities (20) and $K^\wedge$ is a bounded function.

We refer to kernels $K$ that satisfy these equivalent properties as
Calderón-Zygmund distributions.$^{13}$
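
As a concrete illustration of condition (22) with $|\alpha| = 1$, which is not taken from the text, consider in $\mathbb{R}^2$ the function $m(\xi) = \xi_1/|\xi|$. Up to a constant factor this is the multiplier of a Riesz transform (a standard fact assumed here). Since $m$ is homogeneous of degree 0, the quantity $|\xi|\,|\nabla m(\xi)|$ should be bounded and unchanged under dilations $\xi \mapsto a\xi$. The minimal Python sketch below checks this by central differences; the function names are hypothetical and the step size is an arbitrary choice.

```python
import math


def riesz_multiplier(xi1, xi2):
    """Profile xi_1/|xi| of a Riesz-type multiplier on R^2 (illustrative choice)."""
    return xi1 / math.hypot(xi1, xi2)


def scaled_gradient_norm(xi1, xi2, rel_step=1e-6):
    """Return |xi| * |grad m(xi)|, with the gradient taken by central differences.

    The step is proportional to |xi|, so the check itself is scale-invariant.
    """
    norm = math.hypot(xi1, xi2)
    h = rel_step * norm
    d1 = (riesz_multiplier(xi1 + h, xi2) - riesz_multiplier(xi1 - h, xi2)) / (2 * h)
    d2 = (riesz_multiplier(xi1, xi2 + h) - riesz_multiplier(xi1, xi2 - h)) / (2 * h)
    return norm * math.hypot(d1, d2)


if __name__ == "__main__":
    # The same direction at two very different scales gives the same value.
    print(scaled_gradient_norm(3.0, 4.0))
    print(scaled_gradient_norm(300.0, 400.0))
```

For this particular $m$ one can compute by hand that $|\xi|\,|\nabla m(\xi)| = |\xi_2|/|\xi| \le 1$, which is the bound (22) for first derivatives with $c'_\alpha = 1$.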

The proof will be facilitated by noting the dilation-invariance of the set
of all distributions that satisfy the above conditions. Recall the scaling
of a distribution $K$ as defined in Section 2.1. For each $a > 0$, the scaled
distribution $K_a$ is given by $K_a(\varphi) = K(\varphi_a)$, with $\varphi_a(x) = \varphi(ax)$. With
this we claim that whenever $K$ satisfies (20) and (21), $K_a$ satisfies (20)
and (21) with the same bounds. In fact, the function associated to $K_a$
is $a^{-d}k(x/a)$, while $K_a(\varphi_r) = K(\varphi_{ar})$, as the reader may easily verify.
Moreover, if $m = K^\wedge$, then $m_a = (K_a)^\wedge$, and $m_a(\xi) = m(a\xi)$, so $m_a$
satisfies (22) with the same bounds.

Once this is observed, the proof of the proposition is in the same spirit
as that of Theorem 2.4, and so we will be correspondingly brief. Let us
begin by assuming condition (i). We first observe that $m = K^\wedge$ is a $C^\infty$
function away from the origin. This is done by splitting $K$ as $K_0 + K_1$,
where $K_0 = \eta K$ and $K_1 = (1 - \eta)K$ (with $\eta$ a $C^\infty$ cut-off function that
is supported in the unit ball and is equal to 1 when $|x| \le 1/2$), and
proceeding as in the proof of Theorem 2.4.

$^{13}$ We should note that phrases like “Calderón-Zygmund operators” or
“Calderón-Zygmund kernels” have been used in many contexts to denote
different but related objects in the theory.

To show that the inequalities (22) are satisfied for $m(\xi) = K^\wedge(\xi)$, $\xi \ne 0$,
we can reduce matters to the case $|\xi| = 1$ by the dilation-invariance
pointed out above. Now by Proposition 1.6, $K_0^\wedge(\xi) = K(\eta e^{-2\pi i x\cdot\xi})$, and
the latter is $K(\varphi)$ with $\varphi(x) = \eta(x)e^{-2\pi i x\cdot\xi}$. Now $\varphi$ is a multiple
(independent of $\xi$, for $|\xi| = 1$) of a $C^{(n)}$-normalized bump function, so (21)
implies $|K_0^\wedge(\xi)| \le c'$. The same argument gives $|\partial_\xi^\alpha K_0^\wedge(\xi)| \le c'_\alpha$.

Next, since $K_1 = (1 - \eta)K = (1 - \eta)k$ is supported where $|x| \ge 1/2$,
we have by (7)

$$|\xi|^{2N}\,|\partial_\xi^\alpha K_1^\wedge(\xi)| = c\,\big|(\Delta^N(x^\alpha K_1))^\wedge(\xi)\big| \le c'_\alpha \int_{|x|\ge 1/2} |x|^{-d+|\alpha|-2N}\,dx < \infty$$

if $2N > |\alpha|$. Thus $|\partial_\xi^\alpha K_1^\wedge(\xi)| \le c'_\alpha$ when $|\xi| = 1$, and therefore combining
the estimates for $K_0^\wedge$ and $K_1^\wedge$ implies (ii) in the proposition.

To prove that (ii) implies (i), we first assume that $m$ satisfies (22)
and, in addition, has bounded support, but we will make our estimates
independent of the size of the support of $m$.

Define $K(x) = \int_{\mathbb{R}^d} m(\xi)e^{2\pi i \xi\cdot x}\,d\xi$. Then clearly $K$ is a bounded $C^\infty$
function on $\mathbb{R}^d$ and $K^\wedge = m$ in the sense of distributions. In proving
the differential inequalities (20), it will be sufficient to do this for $|x| = 1$,
because of the dilation-invariance used earlier. Now write $K = K_0 + K_1$,
with $K_j$ defined like $K$ with $m$ replaced by $m_j$, where $m_0(\xi) = m(\xi)\eta(\xi)$
and $m_1(\xi) = m(\xi)(1 - \eta(\xi))$. Now obviously $|\partial_x^\alpha K_0(x)| \le c_\alpha$, since $m_0$
is bounded and is supported in the unit ball. Also in analogy with (7)
and the previous argument,

$$|x|^{2N}\,|\partial_x^\alpha K_1(x)| = c\,\Big|\int_{\mathbb{R}^d} \Delta_\xi^N\big(\xi^\alpha m_1(\xi)\big)e^{2\pi i \xi\cdot x}\,d\xi\Big| \le c_{\alpha,N}\int_{|\xi|\ge 1/2} |\xi|^{|\alpha|-2N}\,d\xi < \infty$$

if $2N - |\alpha| > d$. Since $|x| = 1$, these estimates for $K_0$ and $K_1$ yield (20)
for $|x| = 1$, and thus for all $x \ne 0$.

To prove the cancelation condition, take $n = d+1$. Note first that
$(2\pi i \xi)^\alpha \hat{\varphi}(\xi) = (\partial_x^\alpha \varphi)^\wedge(\xi)$, so this implies that
$\sup_\xi (1 + |\xi|)^{d+1}|\hat{\varphi}(\xi)| \le c$,
whenever $\varphi$ is a $C^{(n)}$-normalized bump function, and as a result

$$\int_{\mathbb{R}^d} |\hat{\varphi}(\xi)|\,d\xi \le c\int_{\mathbb{R}^d} \frac{d\xi}{(1+|\xi|)^{d+1}} \le c'$$

for such normalized bump functions.

However, $K(\varphi_r) = K_r(\varphi) = \int m_r(-\xi)\hat{\varphi}(\xi)\,d\xi$. Therefore
$|K(\varphi_r)| \le \sup_\xi |m(\xi)| \int |\hat{\varphi}(\xi)|\,d\xi \le A$, and the condition (21) is established.

To dispense with the hypothesis that $m$ has compact support, consider
the family $m_\epsilon(\xi) = m(\xi)\eta_\epsilon(\xi)$, with $\epsilon > 0$ (here $\eta_\epsilon(\xi) = \eta(\epsilon\xi)$).
Observe that each $m_\epsilon$ has compact support and (22) is satisfied uniformly
in $\epsilon$. Set

$$K_\epsilon(x) = \int_{\mathbb{R}^d} m_\epsilon(\xi)e^{2\pi i \xi\cdot x}\,d\xi.$$

Then since $m_\epsilon \to m$ pointwise and boundedly as $\epsilon \to 0$, the convergence
is also in the sense of tempered distributions, and this implies the
convergence of $K_\epsilon$ to $K$ in the sense of tempered distributions, with $K^\wedge = m$.
Now the differential inequalities (20) hold for $K_\epsilon$ and $x \ne 0$, uniformly
in $\epsilon$. Thus these estimates hold for $K$ (more precisely for its associated
function $k$). Similarly, since the cancelation conditions (21) hold for $K_\epsilon$
uniformly in $\epsilon$, these conditions hold for $K$, and thus altogether we see
that (ii) implies (i). We observe that the argument just given shows that
(iii) implies (i). Since (iii) is clearly a consequence of (i) and (ii) together,
all three conditions are equivalent, finishing the proof of the proposition.


The following points may help clarify the nature of the hypotheses
concerning Calderón-Zygmund distributions.

• It is clear that if the cancelation condition holds for $C^{(n)}$-normalized
bump functions for a given $n$, then it also holds with $n' > n$. In
the other direction, it can be shown that in the presence of (20),
the fact that (21) holds for some $n$ implies that it holds for $n = 1$,
and thus for all $n' \ge 1$. This is sketched in Exercise 32.

• Given a function $k$ that satisfies the differential inequalities (20), we
may ask if there is a Calderón-Zygmund distribution $K$ that has $k$
as its associated function. The necessary and sufficient condition
on $k$ is that

$$\sup_{0<a<b}\Big|\int_{a<|x|<b} k(x)\,dx\Big| < \infty.$$

The proof of this fact is outlined in Exercise 33. Note however
that $K$ is not uniquely determined by $k$.

• We make a last remark about the significance of Calderón-Zygmund
distributions in the theory of partial differential equations. It is
that, whenever $Q$ is a parametrix for an elliptic operator $L$ of order
$m$ as in Section 2.5, then $\partial_x^\alpha Q$ is a Calderón-Zygmund
distribution, whenever $|\alpha| \le m$. This follows immediately from the
estimate (17) and the characterization of such distributions by the
Fourier transform given by assertion (ii) of the proposition.
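
The necessary and sufficient condition in the second remark can be tried out numerically in dimension $d = 1$. The sketch below is an illustration under stated assumptions, not part of the text: it compares the odd kernel $k(x) = 1/x$ (the Hilbert transform kernel up to a constant) with the even kernel $k(x) = 1/|x|$. Both satisfy (20) away from the origin, but only the first has bounded integrals over the annuli $a < |x| < b$. The function name and the quadrature rule are hypothetical choices.

```python
import math


def annulus_integral(k, a, b, n=2000):
    """Midpoint approximation of the integral of k over a < |x| < b on the real line.

    Uses the substitution x = a * (b/a)**t with 0 < t < 1, which suits kernels
    of size 1/|x|; the contributions of x and -x are added together.
    """
    log_ratio = math.log(b / a)
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        x = a * math.exp(t * log_ratio)
        # dx = x * log_ratio * dt under the substitution above
        total += (k(x) + k(-x)) * x * log_ratio / n
    return total


if __name__ == "__main__":
    for a, b in [(1e-1, 1e1), (1e-3, 1e3), (1e-6, 1e6)]:
        odd = annulus_integral(lambda x: 1.0 / x, a, b)
        even = annulus_integral(lambda x: 1.0 / abs(x), a, b)
        print(a, b, odd, even)
```

For the odd kernel the two half-line contributions cancel, so the integrals vanish for every $a < b$. For $1/|x|$ the integral equals $2\log(b/a)$, which is unbounded, so by the remark no Calderón-Zygmund distribution has $1/|x|$ as its associated function.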

3.2 The $L^p$ theory

The $L^p$ estimates for operators of the form (18) are given by the following
theorem.

Theorem 3.2 Let $T$ be the operator $T(f) = f * K$, with $K$ as in
Proposition 3.1. Then $T$, initially defined for $f$ in $\mathcal{S}$, extends to a bounded
operator on $L^p(\mathbb{R}^d)$, for $1 < p < \infty$.

This means that for each $p$, $1 < p < \infty$, there is a bound $A_p$ so that

$$\|T(f)\|_{L^p(\mathbb{R}^d)} \le A_p\|f\|_{L^p(\mathbb{R}^d)} \tag{23}$$

for $f \in \mathcal{S}$. Thus by Proposition 5.4 in Chapter 1 we see that $T$ has a
(unique) extension to all of $L^p$ that satisfies the bound (23) for $f \in L^p$.
We break the proof into five steps.

Step 1: $L^2$ estimate. The case $p = 2$ follows directly from the fact that
$(Tf)^\wedge = f^\wedge K^\wedge$ (see Proposition 1.5) and that

$$\|Tf\|_{L^2} = \|(Tf)^\wedge\|_{L^2} \le \Big(\sup_\xi |K^\wedge(\xi)|\Big)\|\hat{f}\|_{L^2} \le A\|f\|_{L^2},$$

by Plancherel's theorem. The inequality $\sup_\xi |K^\wedge(\xi)| \le A$ is of course a
consequence of Proposition 3.1.
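
The mechanism of Step 1 has an exact discrete analogue, sketched below for illustration only. On the cyclic group of $n$ points the discrete Fourier transform satisfies a Plancherel identity, so multiplying the transform by a symbol bounded by 1 cannot increase the $\ell^2$ norm. The sketch uses the symbol $-i\,\mathrm{sign}(k)$, a discrete stand-in for the Hilbert transform multiplier; the function names and the sample data are assumptions, and the naive transform is chosen for transparency rather than speed.

```python
import cmath
import math


def dft(values, sign):
    """Naive unnormalized discrete Fourier transform with kernel exp(sign*2*pi*i*a*b/n)."""
    n = len(values)
    return [
        sum(values[a] * cmath.exp(sign * 2j * math.pi * a * b / n) for a in range(n))
        for b in range(n)
    ]


def apply_multiplier(f, symbol):
    """Compute T f where (T f)^ = symbol * f^, with frequencies indexed in (-n/2, n/2]."""
    n = len(f)
    f_hat = dft(f, -1)
    tf_hat = [symbol(b if b <= n // 2 else b - n) * f_hat[b] for b in range(n)]
    return [v / n for v in dft(tf_hat, +1)]


def hilbert_symbol(freq):
    """The bounded symbol -i * sign(freq)."""
    return -1j * ((freq > 0) - (freq < 0))


def l2_norm(values):
    return math.sqrt(sum(abs(v) ** 2 for v in values))


if __name__ == "__main__":
    f = [math.exp(-((i - 16) / 4.0) ** 2) for i in range(32)]
    tf = apply_multiplier(f, hilbert_symbol)
    print(l2_norm(tf), l2_norm(f))
```

The printed norm of the transformed sequence does not exceed that of the input, mirroring the inequality $\|Tf\|_{L^2} \le \sup_\xi|K^\wedge(\xi)|\,\|f\|_{L^2}$.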

Step 2: A variant of atoms. While our operator $T$ does not in general
map $L^1$ to itself (as the example in Section 3.2 of the previous chapter
already shows), its $L^p$ theory for $1 < p < \infty$ is bound up with a
“weak-type” $L^1$ estimate, as was the case for the maximal function treated
in Section 4 of Chapter 2. Here we arrive at this kind of estimate by
studying the action of $T$ on variants of the atoms that are relevant for
the Hardy space theory. In the present situation we deal with “1-atoms,”
the case $p = 1$ of the $p$-atoms (specifically excluded from Corollary 5.3
in the previous chapter!).

A 1-atom $a$ associated to a ball $B$ is an $L^2$ function with:

(i) $a$ is supported in $B$, and $\int |a(x)|\,dx \le 1$.

(ii) $\int_B a(x)\,dx = 0$.

Notice that the $L^2$ norm of $a$ does not enter into the conditions (i) and
(ii) above; the requirement that $a \in L^2$ is made only for technical
convenience.

For each ball $B$ we will denote by $B^*$ its double, that is, the ball
with the same center as $B$ but with twice its radius. The key estimate
involving our operator $T$ and 1-atoms is that there is a bound $A$ so that

$$\int_{(B^*)^c} |T(a)(x)|\,dx \le A, \quad \text{for all 1-atoms } a. \tag{24}$$

Now (24) will be a consequence of an inequality satisfied by the function
$k$ associated to the distribution kernel $K$ of the operator, namely that
for all $r > 0$

$$\int_{|x|\ge 2r} |k(x - y) - k(x)|\,dx \le A, \quad \text{whenever } |y| \le r. \tag{25}$$

To see (25), note that by the mean-value theorem,

$$|k(x - y) - k(x)| \le |y| \sup_{z\in L} |\nabla k(z)|,$$

where $L$ is the line segment joining $x$ to $x - y$. Since $|x| \ge 2r$ and $|y| \le r$,
it follows that $|z| \ge |x|/2$, whenever $z \in L$. Thus the differential
inequalities (20) for $|\alpha| = 1$ show that $|k(x - y) - k(x)| \le c\,r|x|^{-d-1}$, and (25)
follows because $r\int_{|x|\ge 2r} |x|^{-d-1}\,dx$ is independent of $r$ (and is finite).
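
To see the scale-invariance in (25) concretely, one can evaluate the left-hand side for a specific kernel. The sketch below, which is only an illustration and assumes the one-dimensional kernel $k(x) = 1/(\pi x)$, integrates $|k(x-y) - k(x)|$ over $|x| \ge 2r$ after the substitution $x = \pm 2r/u$, $0 < u \le 1$. The function names and the midpoint rule are hypothetical choices.

```python
import math


def hilbert_kernel(x):
    """The kernel 1/(pi x) on the real line (illustrative choice of k)."""
    return 1.0 / (math.pi * x)


def hormander_integral(r, y, n=20000):
    """Midpoint approximation of the integral of |k(x - y) - k(x)| over |x| >= 2r.

    Both half-lines are mapped to 0 < u <= 1 via x = +-2r/u, so the infinite
    range becomes a bounded one with a smooth integrand.
    """
    total = 0.0
    h = 1.0 / n
    for i in range(n):
        u = (i + 0.5) * h
        x = 2.0 * r / u
        jacobian = 2.0 * r / (u * u)  # |dx/du|
        for side in (1.0, -1.0):
            diff = hilbert_kernel(side * x - y) - hilbert_kernel(side * x)
            total += abs(diff) * jacobian * h
    return total


if __name__ == "__main__":
    for r in (0.1, 1.0, 5.0):
        print(r, hormander_integral(r, r))
```

For this kernel and $0 < y \le r$ the integral can be computed by partial fractions to be $\frac{1}{\pi}\log\frac{2r+y}{2r-y}$, so the worst case $y = r$ gives $\frac{\log 3}{\pi}$ for every $r$, in agreement with the claim that the bound in (25) does not depend on $r$.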

To deduce (24) from this, observe first that whenever $f$ is in $\mathcal{D}$ and is
supported in the ball $B$, then for $x \notin B^*$ we have

$$T(f)(x) = \int_B k(x - y)f(y)\,dy.$$

This is so because the distribution $K$ agrees with the function $k$ away
from the origin and here $|x - y| \ge r$. Since $k(x - y)$ is bounded there, a
passage to the limit shows that the same identity holds if $f$ is supported
in $B$ and is assumed merely to be in $L^2$. So if $a$ is a 1-atom associated
to $B$ and $x \notin B^*$, we have

$$T(a)(x) = \int_B k(x - y)a(y)\,dy = \int_B \big(k(x - y) - k(x)\big)a(y)\,dy,$$

because $\int_B a(y)\,dy = 0$. Therefore,

$$\int_{x\notin B^*} |T(a)(x)|\,dx \le \int_B \Big(\int_{x\notin B^*} |k(x - y) - k(x)|\,dx\Big)|a(y)|\,dy,$$

and (24) is established if we invoke (25) with $r$ the radius of the ball $B$
(which, by translation-invariance, we may take to be centered at the origin).

Step 3: The decomposition. We exploit (24) by decomposing any
integrable function $f$ as a sum of a “good” function $g$, for which the $L^2$
theory applies, and an infinite sum of multiples of atoms, for which the
estimate (24) is used.

Lemma 3.3 For each $f$ in $L^1(\mathbb{R}^d)$ and $\alpha > 0$, we can find an open
set $E_\alpha$ and a decomposition $f = g + b$ so that:

(a) $m(E_\alpha) \le \frac{c}{\alpha}\|f\|_{L^1}$.

(b) $|g(x)| \le c\alpha$, for all $x$.

(c) $E_\alpha$ is a union $\bigcup Q_k$ of cubes $Q_k$ whose interiors are disjoint.
Moreover $b = \sum_k b_k$, with each function $b_k$ supported in $Q_k$ and

$$\int |b_k(x)|\,dx \le c\alpha\, m(Q_k), \quad \text{while} \quad \int_{Q_k} b_k(x)\,dx = 0.$$

Note that (c) implies that $b$ is supported in $E_\alpha$, hence $g(x) = f(x)$ if
$x \notin E_\alpha$. Observe also that each $b_k$ is of the form $c\alpha\, m(Q_k)a_k$, where $a_k$
is a 1-atom.

The proof of the lemma is a simplified version of the argument used to
prove Proposition 5.1 in the previous chapter; in particular, here we use
the full maximal function $f^*$ instead of the truncated version $f^\dagger$. The
guiding idea is to try to cut the domain of $f$ into the set where $|f(x)| > \alpha$
and its complement. However, as before, we must be more subtle and in
the present situation cut $f$ according to where $f^*(x) > \alpha$. Thus we take
$E_\alpha = \{x : f^*(x) > \alpha\}$. The conclusion (a) is therefore the weak-type
estimate for $f^*$ given in (27) of the previous chapter.

Next, since $E_\alpha$ is open we can write it as $\bigcup_k Q_k$, where the $Q_k$ are
closed cubes with disjoint interiors, with the distance of $Q_k$ from $E_\alpha^c$
comparable to the diameter of $Q_k$. (This is Lemma 5.2 of the previous
chapter.) Now set

$$m_k = \frac{1}{m(Q_k)}\int_{Q_k} f\,dx.$$

Thus if $\overline{x}_k$ is a point of $E_\alpha^c$ closest to $Q_k$, one has
$|m_k| \le c f^*(\overline{x}_k) \le c\alpha$. We define $g(x) = f(x)$ for $x \notin E_\alpha$, and
$g(x) = m_k$ for $x \in Q_k$. As a result $|f(x)| \le c\alpha$ for $x \in E_\alpha^c$, because
$f^*(x) \le \alpha$ there. Altogether then $|g(x)| \le c\alpha$, proving conclusion (b).

Finally, $b(x) = f(x) - g(x)$ is supported in $E_\alpha = \bigcup_k Q_k$ and hence
$b = \sum_k b_k$, where each $b_k$ is supported in $Q_k$ and equals $f(x) - m_k$ there.
Thus

$$\int |b_k(x)|\,dx = \int_{Q_k} |f(x) - m_k|\,dx \le \int_{Q_k} |f(x)|\,dx + |m_k|\,m(Q_k).$$

Also as before

$$\int_{Q_k} |f(x)|\,dx \le c\,m(Q_k)f^*(\overline{x}_k) \le c\alpha\,m(Q_k),$$

hence

$$\int |b_k(x)|\,dx \le c\alpha\,m(Q_k),$$

since $|m_k| \le c\alpha$. Clearly, $\int b_k(x)\,dx = \int_{Q_k}(f(x) - m_k)\,dx = 0$, and so
the decomposition lemma is proved.

One observes that if we were also given that $f$ was in $L^2(\mathbb{R}^d)$, then it
would follow that $g$, $b$, and each $b_k$ would also be in $L^2(\mathbb{R}^d)$. Since the
supports of the $b_k$ are disjoint, the sum $b = \sum_k b_k$ would converge not
only in the obvious pointwise sense, but also in the $L^2$ norm.
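
The shape of the decomposition $f = g + b$ can be seen in a small computation. The sketch below is not the construction used in the proof: instead of the maximal function and the cubes of Lemma 5.2, it uses the simpler dyadic stopping-time selection on a step function on $[0,1)$ with $2^j$ cells, which is a swapped-in technique chosen because it fits in a few lines. The function name, the sample data, and the requirement that $\alpha$ be at least the mean of $|f|$ are all assumptions of this sketch.

```python
def cz_decompose(f, alpha):
    """Dyadic Calderon-Zygmund-type decomposition of a step function on [0, 1).

    f is a list of 2**j cell values. Returns (g, b, cubes), where cubes lists the
    maximal dyadic index ranges (lo, hi) on which the mean of |f| exceeds alpha;
    on each of them g equals the mean of f and b = f - g, while g = f elsewhere.
    """
    n = len(f)
    if n == 0 or n & (n - 1):
        raise ValueError("the number of cells must be a power of two")
    if sum(abs(v) for v in f) / n > alpha:
        raise ValueError("alpha must be at least the mean of |f|")
    g = list(f)
    b = [0.0] * n
    cubes = []

    def visit(lo, hi):
        mean_abs = sum(abs(v) for v in f[lo:hi]) / (hi - lo)
        if mean_abs > alpha:
            # Stop here: the parent interval had mean at most alpha,
            # so this mean is at most 2 * alpha.
            mean = sum(f[lo:hi]) / (hi - lo)
            for i in range(lo, hi):
                g[i] = mean
                b[i] = f[i] - mean
            cubes.append((lo, hi))
        elif hi - lo > 1:
            mid = (lo + hi) // 2
            visit(lo, mid)
            visit(mid, hi)

    visit(0, n)
    return g, b, cubes


if __name__ == "__main__":
    g, b, cubes = cz_decompose([0, 0, 0, 8, 0, 0, 1, 1], alpha=1.5)
    print(cubes)
    print(g)
    print(b)
```

In this dyadic variant the analogues of (a), (b), and (c) hold with explicit constants: the selected intervals have total length at most $\|f\|_{L^1}/\alpha$, the good part satisfies $|g| \le 2\alpha$, and each piece of $b$ has mean zero on its interval.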

Step 4: Weak-type estimate. Here we show that

$$m(\{x : |T(f)(x)| > \alpha\}) \le \frac{A}{\alpha}\|f\|_{L^1}, \quad \text{for each } \alpha > 0 \tag{26}$$

whenever $f \in L^1 \cap L^2$, with the bound $A$ independent of $f$ and $\alpha$. To do
this we decompose $f = g + b$ according to the lemma and note that

$$m(\{x : |T(f)(x)| > \alpha\}) \le m(\{x : |T(g)(x)| > \alpha/2\}) + m(\{x : |T(b)(x)| > \alpha/2\}),$$

because $T(f) = T(g) + T(b)$. Now by Tchebychev's inequality and the
$L^2$ estimate for $T$,

$$m(\{x : |T(g)(x)| > \alpha/2\}) \le \Big(\frac{2}{\alpha}\Big)^2\|Tg\|_{L^2}^2 \le \frac{c}{\alpha^2}\|g\|_{L^2}^2.$$

However $\int |g(x)|^2\,dx = \int_{E_\alpha^c} |g(x)|^2\,dx + \int_{E_\alpha} |g(x)|^2\,dx$. Now on $E_\alpha^c$ we
have $g(x) = f(x)$ and $|g(x)| \le c\alpha$, so the first integral on the right is
majorized by $c\alpha\|f\|_{L^1}$. Also

$$\int_{E_\alpha} |g(x)|^2\,dx \le c\alpha^2 m(E_\alpha) \le c\alpha\|f\|_{L^1},$$

by conclusion (a) of the lemma. As a result

$$m(\{x : |T(g)(x)| > \alpha/2\}) \le \frac{c}{\alpha}\|f\|_{L^1}.$$

To deal with $T(b) = \sum_k T(b_k)$, we let $B_k$ denote the smallest ball that
contains $Q_k$, and $B_k^*$ the double of $B_k$. We define $E_\alpha^* = \bigcup B_k^*$. Now,
again by Tchebychev's inequality, for a bounded set $S$,

$$m(\{x \in S : |T(b)(x)| > \alpha/2\}) \le \frac{2}{\alpha}\int_S |T(b)(x)|\,dx \le \frac{2}{\alpha}\sum_k \int_S |T(b_k)(x)|\,dx,$$

since $T(b) = \sum_k T(b_k)$, with convergence in the $L^2$ norm.

Now set $S = (E_\alpha^*)^c \cap B$, where $B$ is a large ball. Letting the radius of
$B$ tend to infinity then yields

$$m(\{x \notin E_\alpha^* : |T(b)(x)| > \alpha/2\}) \le \frac{2}{\alpha}\sum_k \int_{(B_k^*)^c} |T(b_k)(x)|\,dx,$$

because $E_\alpha^* = \bigcup B_k^*$ implies that $(E_\alpha^*)^c \subset (B_k^*)^c$ for each $k$. However
as we have noted, $b_k$ is of the form $c\alpha\,m(Q_k)a_k$, where $a_k$ is a 1-atom
associated to the ball $B_k$. Hence the estimate (24) gives

$$m(\{x \in (E_\alpha^*)^c : |T(b)(x)| > \alpha/2\}) \le c\sum_k m(Q_k) = c\,m(E_\alpha) \le \frac{c}{\alpha}\|f\|_{L^1}.$$

Finally,

$$m(E_\alpha^*) \le \sum_k m(B_k^*) = c\sum_k m(Q_k) = c\,m(E_\alpha) \le \frac{c}{\alpha}\|f\|_{L^1},$$

because $m(B_k^*) = c\,m(Q_k)$ for every $k$.

Gathering the inequalities for $T(g)$ and $T(b)$ together then shows that
the weak-type estimate (26) is established.

Step 5: The $L^p$ inequalities. We now borrow the idea used in Chapter 2
in the proof of the $L^p$ estimates for the maximal function $f^*$ in which the
weak-type inequality is transformed to its more elaborate form, given in
equation (28) of that chapter. In our case the stronger version is

$$m(\{x : |T(f)(x)| > \alpha\}) \le A\Big(\frac{1}{\alpha}\int_{|f|>\alpha} |f|\,dx + \frac{1}{\alpha^2}\int_{|f|\le\alpha} |f|^2\,dx\Big) \tag{27}$$

whenever $f$ belongs to both $L^1$ and $L^2$. To prove this, we cut $f$ (this time,
more simply) into two parts for each $\alpha > 0$, according to the size of $f$.
Namely, we set $f = f_1 + f_2$ where $f_1(x) = f(x)$ if $|f(x)| > \alpha$, and $f_1(x) = 0$
otherwise; also $f_2(x) = f(x)$ if $|f(x)| \le \alpha$, and $f_2(x) = 0$ otherwise.
Then again

$$m(\{|T(f)(x)| > \alpha\}) \le m(\{|T(f_1)(x)| > \alpha/2\}) + m(\{|T(f_2)(x)| > \alpha/2\}).$$

By the weak-type estimate just proved,

$$m(\{|T(f_1)(x)| > \alpha/2\}) \le \frac{A}{\alpha}\|f_1\|_{L^1} = \frac{A}{\alpha}\int_{|f|>\alpha} |f|\,dx.$$

By the $L^2$-boundedness of $T$ and Tchebychev's inequality

$$m(\{|T(f_2)(x)| > \alpha/2\}) \le \Big(\frac{2}{\alpha}\Big)^2\|T(f_2)\|_{L^2}^2 \le \frac{A}{\alpha^2}\int_{|f|\le\alpha} |f|^2\,dx,$$

proving (27). (Here, as before, $A$ denotes a constant that may change from
one line to the next.)

Now (see (29) in Chapter 2)

$$\int |T(f)(x)|^p\,dx = \int_0^\infty \lambda(\alpha^{1/p})\,d\alpha,$$

where $\lambda(\alpha) = m(\{x : |T(f)(x)| > \alpha\})$. Therefore, because of (27), the
above integrals are majorized by

$$A\Big(\int_0^\infty \alpha^{-1/p}\Big(\int_{|f|>\alpha^{1/p}} |f|\,dx\Big)d\alpha + \int_0^\infty \alpha^{-2/p}\Big(\int_{|f|\le\alpha^{1/p}} |f|^2\,dx\Big)d\alpha\Big).$$

We have

$$\int_0^\infty \alpha^{-1/p}\Big(\int_{|f|>\alpha^{1/p}} |f|\,dx\Big)d\alpha = \int |f|\Big(\int_0^{|f|^p} \alpha^{-1/p}\,d\alpha\Big)dx = a_p\int |f|^p\,dx$$

if $p > 1$, where $a_p = p/(p-1)$. Also,

$$\int_0^\infty \alpha^{-2/p}\Big(\int_{|f|\le\alpha^{1/p}} |f|^2\,dx\Big)d\alpha = b_p\int |f|^p\,dx$$

if $p < 2$, with $b_p = p/(2-p)$. Thus we get

$$\|T(f)\|_{L^p} \le A_p\|f\|_{L^p},$$

with $A_p^p = A\cdot p\cdot\big(\frac{1}{p-1} + \frac{1}{2-p}\big)$. This takes care of the case $1 < p < 2$ (the
case $p = 2$ having been settled before).
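
The dependence of the bound on $p$ can be tabulated directly from the formula $A_p^p = A(a_p + b_p)$ with $a_p = p/(p-1)$ and $b_p = p/(2-p)$. The short sketch below does this for a given value of the constant $A$ in (27); the function name is hypothetical, and the numbers only describe the constants produced by this particular argument, not the best possible ones.

```python
def lp_constant(weak_bound, p):
    """Return A_p with A_p**p = weak_bound * (p/(p-1) + p/(2-p)), for 1 < p < 2."""
    if not 1.0 < p < 2.0:
        raise ValueError("this formula is only used for 1 < p < 2")
    a_p = p / (p - 1.0)
    b_p = p / (2.0 - p)
    return (weak_bound * (a_p + b_p)) ** (1.0 / p)


if __name__ == "__main__":
    for p in (1.01, 1.1, 1.5, 1.9, 1.99):
        print(p, lp_constant(1.0, p))
```

The values grow without bound as $p \to 1$, reflecting the failure of the $L^1$ estimate. They also grow as $p \to 2$, but this is only a feature of the splitting used here, since the case $p = 2$ was settled directly in Step 1.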

To pass to the case $2 < p < \infty$, we use the duality of $L^p$ spaces set
forth in Section 4 of the first chapter.

We note that whenever $f$ and $g$ are in $\mathcal{S}$ then by Plancherel's theorem

$$\int_{\mathbb{R}^d} T(f)\,\overline{g}\,dx = \int_{\mathbb{R}^d} f\,\overline{T^*(g)}\,dx.$$

Here $T^*(g) = g * K^*$, where $(K^*)^\wedge = \overline{m}$, with $m = K^\wedge$. Now $\overline{m}$ satisfies
the same characterization (22) that $m$ does, and hence the results above
apply to $T^*$. In particular the identity

$$\int_{\mathbb{R}^d} (Tf)\,\overline{g}\,dx = \int_{\mathbb{R}^d} f\,\overline{(T^*g)}\,dx \tag{28}$$

extends to $f$ and $g$ in $L^2$.

Next with $2 < p < \infty$, let $q$ be its dual exponent ($1/p + 1/q = 1$),
where now $1 < q < 2$. Then, by Lemma 4.2 in Chapter 1,

$$\|T(f)\|_{L^p} = \sup_g \Big|\int T(f)\,\overline{g}\,dx\Big|,$$

where the supremum is taken over all $g$ that are simple with $\|g\|_{L^q} \le 1$.
However

$$\Big|\int T(f)\,\overline{g}\,dx\Big| = \Big|\int f\,\overline{T^*(g)}\,dx\Big| \le \|f\|_{L^p}\|T^*(g)\|_{L^q} \le A_q\|f\|_{L^p},$$

by Hölder's inequality and the boundedness of $T^*$ on $L^q$ ($1 < q < 2$).
The result is now (23) for all $f \in \mathcal{S}$, for $1 < p < \infty$, concluding the proof
of the theorem.


We make two closing comments about the theorem just proved.

• The result leads to “interior” estimates for solutions of elliptic
equations in terms of $L^p$ based Sobolev spaces. As such, these may be
viewed as a quantitative version of Theorem 2.14. This is outlined
in Problem 3.

• The essential properties of $K$ that enter in the proof of the $L^p$
theorem are, first, the $L^2$ boundedness via the Fourier transform,
and second, the use of inequality (25). This inequality has natural
extensions to a variety of contexts that arise in applications,
in particular where the underlying structure of $\mathbb{R}^d$ is replaced by
another suitable “geometry.” However, obtaining $L^2$ boundedness
in other settings is more problematic, since in general the Fourier
transform may be unavailing. For this, further ideas have been
developed that use the almost-orthogonality principle in
Proposition 7.4 of Chapter 8, but these will not be pursued here.


4 Exercises

1. Suppose $F$ is a distribution on $\Omega$ and $F = f$, with $f$ a $C^k$ function in $\Omega$. Show
that $\partial_x^\alpha F$, taken in the sense of distributions, agrees with $\partial_x^\alpha f$ for each $|\alpha| \le k$.

2. The following represent converses to the previous exercise.

(a) Suppose $f$ and $g$ are continuous functions on $(a, b) \subset \mathbb{R}$ and $\frac{df}{dx}$ (taken in
the sense of distributions) agrees with $g$. Show that for every $x \in (a, b)$,
$(f(x + h) - f(x))/h \to g(x)$ as $h \to 0$.

(b) If $f$ and $g$ are merely assumed to be in $L^1(a, b)$ with $\frac{df}{dx} = g$ in the sense
of distributions, then $f$ is absolutely continuous and $(f(x + h) - f(x))/h \to g(x)$
as $h \to 0$ for a.e. $x$.

As a result, if $f$ is a continuous but nowhere differentiable function on $\mathbb{R}$,
then the distribution derivative of $f$ is not a locally integrable function on
any sub-interval.

(c) Generalize (a) as follows: Suppose $k \ge 1$ is an integer, and that $f$ is a
continuous function on an open set $\Omega$. If for each multi-index $\alpha$ with $|\alpha| \le k$,
the distribution $\partial_x^\alpha f$ equals a continuous function $g_\alpha$, then $f$ is of class $C^k$
and $\partial_x^\alpha f = g_\alpha$ as functions, for all $|\alpha| \le k$.

[Hint: To see (a), let $x_0 \in (a, b)$, $h > 0$, and let $\eta$ be a test function on $(a, b)$ so
that $\int \eta = 1$. With $\delta > 0$, define $\eta_\delta(x) = \delta^{-1}\eta(x/\delta)$ and

$$\varphi(x) = \int_{-\infty}^x \big(\eta_\delta(x_0 + h - y) - \eta_\delta(x_0 - y)\big)\,dy.$$

Then $\int f(x)\frac{d}{dx}\varphi(x)\,dx = -\int g(x)\varphi(x)\,dx$, and let $\delta, h \to 0$.

For (b), show as a first step that, up to a constant, $f$ equals the indefinite
integral of $g$, almost everywhere. Then use Theorem 3.8 in Chapter 3, Book III,
about the differentiability almost everywhere of an absolutely continuous function.]

3. Show that a bounded function $f$ on $\mathbb{R}^d$ satisfies a Lipschitz condition (also
known as a Hölder condition of exponent 1)

$$|f(x) - f(y)| \le C|x - y|, \quad \text{for all } x, y \in \mathbb{R}^d,$$

if and only if $f \in L^\infty$ and all the first order partial derivatives $\partial f/\partial x_j$, $1 \le j \le d$,
belong to $L^\infty$ in the sense of distributions.

[Hint: Let $f_n = f * \varphi_n$, where $\{\varphi_n\}$ is an approximation to the identity as in
Corollary 1.2. Then $\partial f_n/\partial x_j \in L^\infty$ uniformly in $n$.]

4. Suppose $F$ is a distribution on $\Omega$.

(a) There exist $f_n \in C^\infty$, each of compact support in $\Omega$, so that $f_n \to F$ in the
sense of distributions.

(b) If $F$ is supported in the compact set $C$, then for every $\epsilon > 0$ we can choose
the $f_n$ so that their supports are in the $\epsilon$-neighborhood of $C$.


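5. Let $f$ be locally integrable on $\mathbb{R}^d$. Then the “support” of $f$ in the
measure-theoretic sense is the set $E = \{x : f(x) \ne 0\}$. Note that $E$ is essentially
determined only modulo sets of measure zero.

Show that the support of $f$, as a distribution, is equal to the intersection of all
closed sets $C$ such that $E - C$ has measure zero.


6. Assume that $\Omega$ is a region in $\mathbb{R}^d$ defined by $\Omega = \{x \in \mathbb{R}^d : x_d > \varphi(x')\}$, with
$x = (x', x_d) \in \mathbb{R}^{d-1} \times \mathbb{R}$, and $\varphi$ a $C^1$ function. Suppose $f$ is a function that is
continuous in $\overline{\Omega}$ and whose first derivatives are also continuous in $\overline{\Omega}$, with
$f|_{\partial\Omega} = 0$. Let $\tilde{f}$ be the extension of $f$ to $\mathbb{R}^d$ defined by $\tilde{f}(x) = f(x)$ if
$x \in \Omega$, and $\tilde{f}(x) = 0$ if $x \notin \Omega$. Then $\frac{\partial \tilde{f}}{\partial x_j}$, taken in the sense of
distributions, is the function which is $\frac{\partial f}{\partial x_j}$ in $\Omega$, and zero in $\Omega^c$. (Note that it is
not necessarily true that $\frac{\partial \tilde{f}}{\partial x_j}$ is continuous.)

[Hint: Show that $-\int_\Omega f\,\frac{\partial\psi}{\partial x_j}\,dx = \int_\Omega \frac{\partial f}{\partial x_j}\,\psi\,dx$ for all $C^\infty$ functions $\psi$ of compact
support in $\mathbb{R}^d$.]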


7. Show that the distribution $F$ is tempered if and only if there is an integer $N$,
and a constant $A$, so that for all $R \ge 1$,

$$|F(\varphi)| \le AR^N \sup_{x,\ 0\le|\alpha|\le N} |\partial_x^\alpha \varphi(x)|,$$

for all $\varphi \in \mathcal{D}$ supported in $|x| \le R$.


8. Suppose $F$ is a homogeneous distribution of degree $\lambda$. Show that $F$ is tempered.

[Hint: Fix $\eta \in \mathcal{D}$, $\eta(x) = 1$ for $|x| \le 1$, $\eta$ supported in $|x| \le 2$. Let $\eta_R(x) =
\eta(x/R)$. Find $N$ so that $|\eta_1 F(\varphi)| \le c\|\varphi\|_N$. Then deduce that
$|(\eta_R F)(\varphi)| \le cR^{N+|\lambda|}\|\varphi\|_N$.]

9. Check that on the real line, $f(x) = e^x$, considered as a distribution, is not
tempered.

[Hint: Show that the criterion in Exercise 7 fails for every $N$.]

10. Verify that $\mathcal{D}$ is dense in $\mathcal{S}$.

[Hint: Fix $\eta \in \mathcal{D}$ so that $\eta = 1$ in a neighborhood of the origin. Let $\eta_k(x) = \eta(x/k)$
and consider $\varphi_k = \eta_k\varphi$.]

11. Suppose that $\varphi_1, \varphi_2 \in \mathcal{S}$.

(a) Verify that $\varphi_1 \cdot \varphi_2$ belongs to $\mathcal{S}$.

(b) Using the Fourier transform, prove that $\varphi_1 * \varphi_2 \in \mathcal{S}$.

(c) Show directly from the definition of convolution that $\varphi_1 * \varphi_2 \in \mathcal{S}$.


12. Prove that if $F_1$ is a distribution of compact support and $\varphi \in \mathcal{S}$, then
$F_1 * \varphi \in \mathcal{S}$.

[Hint: For each $N$, there exists a constant $c_N$ so that

$$\|\varphi_y^\sim\|_N \le c_N(1 + |y|)^N\|\varphi\|_N.]$$


13. Use the previous exercise to prove that if $F_1$ and $F$ are distributions with $F_1$
having compact support and $F$ being tempered then:

(a) $F * F_1$ is tempered, and;

(b) $(F * F_1)^\wedge = F_1^\wedge F^\wedge$. (Note that $F_1^\wedge$ is $C^\infty$ and slowly increasing.)


14. Check that $f(x) = \frac{1}{2}|x|$ is a fundamental solution for $\frac{d^2}{dx^2}$ on $\mathbb{R}$.

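15. A $d$-dimensional generalization of the identity for the Heaviside function is
the identity

$$\delta = \sum_{j=1}^d \frac{\partial}{\partial x_j} h_j,$$

with $h_j(x) = \frac{1}{A_d}\frac{x_j}{|x|^d}$, where $A_d = 2\pi^{d/2}/\Gamma(d/2)$ denotes the area of the unit sphere
in $\mathbb{R}^d$.

[Hint: When $d > 2$, write $\delta = \sum_{j=1}^d \frac{\partial}{\partial x_j}\big(\frac{\partial}{\partial x_j} C_d|x|^{-d+2}\big)$.]


16. Consider the complex plane $\mathbb{C} = \mathbb{R}^2$, with $z = x + iy$.

(a) Note that the Cauchy-Riemann operator

$$\partial_{\bar{z}} = \frac{1}{2}\Big(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\Big)$$

is elliptic.

(b) Show that the locally integrable function $1/(\pi z)$ is a fundamental solution
for $\partial_{\bar{z}}$.

(c) Suppose $f$ is continuous in $\Omega$, and $\partial_{\bar{z}} f = 0$ in the sense of distributions.
Then $f$ is analytic.

[Hint: For (b), use Theorem 2.9, and note that $\Delta = 4\partial_z\partial_{\bar{z}}$, where
$\partial_z = \frac{1}{2}\big(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\big)$.]


17. Suppose $f(z)$ is a meromorphic function on $\Omega \subset \mathbb{C}$. Prove:

(a) $\log|f(z)|$ is locally integrable.

(b) $\Delta(\log|f(z)|)$ taken in the sense of distributions is equal to
$2\pi\sum_j m_j\delta_j - 2\pi\sum_k m'_k\delta'_k$. Here the $\delta_j$ are the delta functions placed at
the distinct zeroes $z_j$ of $f$, namely $\delta_j(\varphi) = \varphi(z_j)$, and the $\delta'_k$ are placed at
the poles $z'_k$ of $f$; also $m_j$ and $m'_k$ are the respective multiplicities.

[Hint: $\frac{1}{2\pi}\log|z|$ is a fundamental solution of $\Delta$.]


18. Prove that a distribution $F$ is homogeneous of degree $\lambda$ if and only if

$$\sum_{j=1}^d x_j\frac{\partial F}{\partial x_j} = \lambda F.$$

[Hint: For the converse, consider $\Phi(a) = F(\varphi^a)$ for $a > 0$, $\varphi \in \mathcal{D}$. Then $\Phi(a)$ is
$C^\infty$ for $a > 0$, and $\frac{d\Phi(a)}{da} = \frac{\lambda}{a}\Phi(a)$.]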

19. Prove the following facts about distributions in $\mathbb{R}$.

(a) Given a distribution $F$, there exists a distribution $F_1$ so that

$$\frac{d}{dx}F_1 = F.$$

(b) Show that $F_1$ is unique modulo an additive constant.

[Hint: For (a) fix $\varphi_0 \in \mathcal{D}$, with $\int \varphi_0 = 1$, and note that each $\varphi \in \mathcal{D}$ can be written
uniquely as $\varphi = \frac{d\psi}{dx} + a\varphi_0$ for some $\psi \in \mathcal{D}$ and a constant $a$. Then define
$F_1(\varphi) = -F(\psi)$. For (b), use the fact that $d/dx$ is elliptic.]

20. Show that if $\lambda_1, \ldots, \lambda_n$ are distinct complex exponents and
$\sum_{j=1}^n (a_j x^{\lambda_j} + b_j x^{\lambda_j}\log x) = 0$ for all $x > 0$, then $a_j = b_j = 0$ for all $1 \le j \le n$.

[Hint: Proceed as in the proof of Lemma 2.6, and use the fact that $\int_1^R x^{-1+i\mu_j}\,dx$
is equal to $\log R$ if $\mu_j = 0$ and that this integral is $O(1)$ if $\mu_j$ is real and $\ne 0$.]

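21. Let $F(x, t) = \mathcal{H}_t(x)$, for $t > 0$, and $F(x, t) = 0$, when $t \le 0$, as in
Theorem 2.10. Prove directly that

$$\hat{F}(\xi, \tau) = \frac{1}{4\pi^2|\xi|^2 + 2\pi i\tau},$$

where $(\xi, \tau) \in \mathbb{R}^d \times \mathbb{R}$, with $\xi$ dual to $x$, and $\tau$ dual to $t$.

[Hint: Use the two identities

$$\int_0^\infty e^{-4\pi^2|\xi|^2 t}e^{-2\pi i\tau t}\,dt = \frac{1}{4\pi^2|\xi|^2 + 2\pi i\tau}, \quad \text{for } |\xi| > 0,$$

and

$$\int_{\mathbb{R}^d} \mathcal{H}_t(x)e^{-2\pi i x\cdot\xi}\,dx = e^{-4\pi^2|\xi|^2 t}, \quad \text{for } t > 0.]$$


22. Suppose $f$ is a locally integrable function defined on $\mathbb{R}$, and let $u$ be the
function defined by $u(x, t) = f(x - t)$, for $(x, t) \in \mathbb{R}^2$. Verify that $u$, taken as a
distribution, satisfies the wave equation

$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial t^2}.$$

More generally, let $F$ be any distribution on $\mathbb{R}$. Construct $U$ (in analogy to
$f(x - t)$) as follows. If $\varphi$ is in $\mathcal{D}(\mathbb{R}^2)$, $\mathbb{R}^2 = \{(x, t)\}$, set
$U(\varphi) = \int_{\mathbb{R}} (F * \varphi(x, \cdot))(x)\,dx$. Then $U$ satisfies

$$\frac{\partial^2 U}{\partial x^2} = \frac{\partial^2 U}{\partial t^2}.$$

Note that $U$ is invariant under the translations $(h, h)$, for $h \in \mathbb{R}$.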

23. Show that in $\mathbb{R}^3$ the function

$$F(x) = -\frac{1}{4\pi|x|}e^{-|x|}$$

is a fundamental solution of the operator $\Delta - I$. The function $F$ is the “Yukawa
potential” in the theory of elementary particles. In contrast to the “Newtonian
potential” $-1/(4\pi|x|)$, the fundamental solution of $\Delta$, the function $F$ has a very
rapid decay at infinity and it thus accounts for the short-range forces in the theory.

[Hint: Let $F$ be the inverse Fourier transform of $-(1 + 4\pi^2|\xi|^2)^{-1}$. Going to polar
coordinates in $\mathbb{R}^3$, one then uses the identity

$$\int_{|\xi|=1} e^{2\pi i\xi\cdot x}\,d\sigma(\xi) = \frac{2\sin(2\pi|x|)}{|x|}$$

together with the Fourier transform of the conjugate Poisson kernel, given by (18)
of the previous chapter.]

24. The following statements deal with the uniqueness of the fundamental
solutions of the Laplacian.

(a) Up to an additive constant, the unique fundamental solutions of $\Delta$ in $\mathbb{R}^d$,
$d \ge 2$, that are rotationally invariant, are the ones given in Theorems 2.8
and 2.9.

(b) The unique fundamental solution of $\Delta$ in $\mathbb{R}^d$, $d \ge 3$, that vanishes at infinity
is the one given in Theorem 2.8.


25. A distribution $F$ defined on $\Omega \subset \mathbb{R}^d$ is positive if $F(\varphi) \ge 0$ for all $\varphi \in \mathcal{D}$
supported in $\Omega$, with $\varphi \ge 0$. Show that $F$ is positive if and only if $F(\varphi) = \int \varphi\,d\mu$
for some Borel measure $d\mu$ on $\Omega$, that is finite on compact subsets.


26. Recall that a real-valued function $f$ on $(a, b)$ is convex if $f(x_0(1 - t) + x_1 t) \le
(1 - t)f(x_0) + tf(x_1)$, for $x_0, x_1 \in (a, b)$, $0 \le t \le 1$. (See also Problem 4 in
Chapter 3, Book III.) A function $f$ on $\Omega \subset \mathbb{R}^d$ is convex if the restriction of $f$ to any
line segment in $\Omega$ is convex.

(a) Suppose $f$ is continuous on $(a, b)$. Then it is convex if and only if the
distribution $\frac{d^2 f}{dx^2}$ is positive.

(b) If $f$ is continuous on $\Omega \subset \mathbb{R}^d$, it is convex if and only if for each
$\xi = (\xi_1, \ldots, \xi_d) \in \mathbb{R}^d$ the distribution $\sum_{1\le i,j\le d} \xi_i\xi_j\frac{\partial^2 f}{\partial x_i\partial x_j}$ is positive.

[Hint: For (a), let $\varphi \in \mathcal{D}$, $\varphi \ge 0$, $\int \varphi\,dx = 1$ and set $\varphi_\epsilon(x) = \epsilon^{-1}\varphi(x/\epsilon)$. Consider
$f_\epsilon = f * \varphi_\epsilon$.]

27. Every distribution $F$ of compact support in $\mathbb{R}^d$ is of finite order in the
following sense: for each such $F$, there exists an integer $M$ and continuous functions
$F_\alpha$ of compact support, so that

$$F = \sum_{|\alpha|\le M} \partial_x^\alpha F_\alpha.$$

Moreover if $F$ is supported in $C$, then for every $\epsilon > 0$ we may take $F_\alpha$ to be
supported in an $\epsilon$-neighborhood of $C$. Prove this by carrying out the following
three steps.

(a) Pick $N$ so that $|F(\varphi)| \le c\|\varphi\|_N$, for all $\varphi \in \mathcal{S}$, and choose $M_0$ so that
$2M_0 > d + N$. Let $Q$ be the inverse Fourier transform of $1/(1 + 4\pi^2|\xi|^2)^{M_0}$, and
observe that $Q$ is a fundamental solution of $(1 - \Delta)^{M_0}$, and $Q$ is of class
$C^N$.

(b) For each $\epsilon$, construct $Q_\epsilon$ corresponding to $Q$, so that $(1 - \Delta)^{M_0}Q_\epsilon = \delta + r_\epsilon$,
where $Q_\epsilon$ is supported in the $\epsilon$-neighborhood of the origin (as in
Corollary 2.13). Prove that $F * Q_\epsilon$ is a continuous function, using the fact that
$|F(\varphi)| \le c\|\varphi\|_N$.

(c) Hence $F = (1 - \Delta)^{M_0}(Q_\epsilon * F) - F * r_\epsilon$, and the result is proved with
$M = 2M_0$.

28. One can characterize tempered distributions $F$ whose Fourier transforms have
compact support.

We already know by Proposition 1.6 that such an $F$ must in fact be a function
$f$ that is $C^\infty$ and slowly increasing. A precise characterization when $d = 1$ is given
in the statement below.

The Fourier transform of a tempered distribution $F$ is supported in the interval
$[-M, M]$ if and only if $F$ equals a function $f$ that is $C^\infty$, slowly increasing, and having
an analytic extension to the complex plane as an entire function of exponential type
$2\pi M$; that is, for every $\epsilon > 0$, $|f(z)| \le A_\epsilon e^{2\pi(M+\epsilon)|z|}$, where $z = x + iy$.

(An analogous assertion holds in higher dimensions.)

[Hint: Assume $\hat{F}$ is supported in $[-M, M]$. Using Exercise 27 allows us to write
$\hat{F} = \sum_\alpha \partial_x^\alpha(g_\alpha)$, a finite sum, where the $g_\alpha$ are continuous and supported in
$[-M - \epsilon, M + \epsilon]$, and thus reduce to the case when $\hat{F}$ is a continuous function.

To prove the converse, consider $f_\delta = f\gamma_\delta$ where $\gamma_\delta(x) = \int e^{2\pi i x\xi}\eta_\delta(\xi)\,d\xi$ and
$\eta_\delta(\xi) = \delta^{-1}\eta(\xi/\delta)$, with $\eta \in C^\infty$, supported in $|\xi| \le 1$ and such that $\int \eta = 1$. Then
$\gamma_\delta(z)$ is of exponential type $2\pi\delta$ and is rapidly decreasing on the real axis. Thus
apply the simpler version of the result given in Theorem 3.3, Chapter 4 in Book II to
the function $f_\delta$, and let $\delta \to 0$.]

29. In this exercise, we consider the $L^2$ Sobolev spaces.

The space $L^2_m$ consists of the functions $f \in L^2(\mathbb{R}^d)$ whose derivatives $\partial_x^\alpha f$, taken
in the sense of distributions, are in $L^2(\mathbb{R}^d)$ for all $|\alpha| \le m$. This space is sometimes
denoted by $H^m(\mathbb{R}^d)$. Note that this is the special case for $p = 2$ of the Sobolev
space given as an example in Section 3 of Chapter 1. However, here we use a
slightly different (but equivalent) norm, which makes $L^2_m$ into a Hilbert space.

On $L^2_m$ we define the inner product

$$(f, g)_m = \sum_{|\alpha|\le m} (\partial_x^\alpha f, \partial_x^\alpha g)_0,$$

with $(f, g)_0 = \int_{\mathbb{R}^d} f(x)\overline{g(x)}\,dx$. Then, with the norm $\|f\|_{L^2_m} = (f, f)_m^{1/2}$, $L^2_m$ is a
Hilbert space.

(a) Verify that $f \in L^2_m$ if and only if $\hat{f}(\xi)(1 + |\xi|)^m \in L^2$, and that the norms
$\|f\|_{L^2_m}$ and $\|\hat{f}(\xi)(1 + |\xi|)^m\|_{L^2}$ are equivalent.

(b) If $m > d/2$, then $f$ can be corrected on a set of measure zero, so that $f$
becomes continuous and is in fact in $C^k$, for $k < m - d/2$. This is a version
of the Sobolev embedding theorem.

[Hint: $\xi^\alpha\hat{f} \in L^1(\mathbb{R}^d)$ if $|\alpha| < m - d/2$.]

30. The following observation is useful in connection with the $L^2$ theory of
Calderón-Zygmund distributions on $\mathbb{R}^d$.

(a) The Fourier transform of the distribution $\left[\frac{1}{|x|^d}\right]$ equals
$c_1 \log|\xi| + c_2$, with $c_1 \neq 0$.

(b) Prove the following consequence of (a). Suppose $k$ is a homogeneous function
of degree $-d$ that is $C^\infty$ away from the origin and with

$$\int_{|x|=1} k(x)\,d\sigma(x) \neq 0.$$

If $K$ is any distribution that agrees with $k$ away from the origin, then the
Fourier transform of $K$ is not a bounded function. Another way of stating
this is that the operator $T$, defined by $T(\varphi) = K * \varphi$ initially for
$\varphi \in \mathcal{D}$, does not extend to a bounded operator on $L^2(\mathbb{R}^d)$.


31. Suppose $k$ is a $C^\infty$ function homogeneous of degree $-d$, not identically equal
to zero, and

$$\int_{|x|=1} k(x)\,d\sigma(x) = 0.$$

If $K$ is the principal value distribution defined by $k$, that is, $K = \mathrm{pv}(k)$, then $K$ is
a Calderón-Zygmund distribution but the operator $T$ given by $Tf = f * K$ is not
bounded on $L^1$ or $L^\infty$.

The special case of the Hilbert transform is in Exercise 7, Chapter 2.

[Hint: If $\varphi \in \mathcal{D}$, then $T\varphi(x) = ck(x) + O(|x|^{-d-1})$ as $|x| \to \infty$, where $c = \int \varphi$.]


32. The cancelation condition (21) for the Calderón-Zygmund distributions for
some $n \ge 1$ implies the condition for $n = 1$. Show this by first proving the following
fact: Whenever $K$ satisfies (20) and (21) for some $n \ge 1$, then for every $1 \le j \le d$,
the distribution $x_j K$ equals the locally integrable function $x_j k$.

[Hint: The distribution $x_j K - x_j k$ is supported at the origin. Then use
Theorem 1.7 to test $x_j K - x_j k$ against $\varphi_r$ as $r \to 0$ for suitable $\varphi$ to conclude that
this difference vanishes. Next, write any $C^{(1)}$-normalized bump function as
$\varphi(x) = \eta(x) + \sum_j x_j \varphi_j(x)$, where $\eta$ and the $\varphi_j$ are multiples of $C^{(n)}$ and
$C^{(0)}$-normalized bump functions respectively, and use the above fact.]


33. Suppose $k$ is a $C^\infty$ function in $\mathbb{R}^d - \{0\}$ that satisfies the differential
inequalities (20). Then there is a Calderón-Zygmund distribution $K$ which has $k$ as its
associated function if and only if

$$\sup_{0<a<b} \left| \int_{a<|x|<b} k(x)\,dx \right| < \infty.$$

[Hint: In one direction, note that $|K(\eta_b - \eta_a)| \le 2A$, where $\eta(x) = 1$ if $|x| \le 1/2$,
and $\eta(x) = 0$ if $|x| \ge 1$, with $\eta \in C^\infty$. In the other direction, define

$$K(\varphi) = \int_{|x| \le 1} k(x)(\varphi(x) - \varphi(0))\,dx + \int_{|x| > 1} k(x)\varphi(x)\,dx$$

and verify that the conditions (20) and (21) hold for $K$.]

34. Suppose $K$ is a Calderón-Zygmund distribution and $\eta$ belongs to $\mathcal{S}$. Verify
that $\eta K$ is a Calderón-Zygmund distribution.


5 Problems 

1. We consider periodic distributions and their Fourier series.

(a) The notion of a periodic distribution on $\mathbb{R}^d$ can be defined in two equivalent
ways:

First, one can consider distributions $F$ on $\mathbb{R}^d$ which are periodic in the sense
that $\tau_h(F) = F$ for all $h \in \mathbb{Z}^d$;

Alternatively, one can consider the continuous linear functionals on $\mathcal{D}(\mathbb{T}^d)$,
the space of $C^\infty$ periodic functions on $\mathbb{R}^d$. (Here $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$ denotes the
$d$-dimensional torus.)

(b) Note that if $\varphi \in \mathcal{D}(\mathbb{T}^d)$, then $\varphi$ has a Fourier series expansion

$$\varphi(x) = \sum_n a_n e^{2\pi i n \cdot x}$$

where the Fourier coefficients $a_n = \int_{\mathbb{T}^d} \varphi(x) e^{-2\pi i n \cdot x}\,dx$ are rapidly
decreasing, that is, for every $N > 0$, $|a_n| \le O(|n|^{-N})$ as $|n| \to \infty$.

Similarly, if $F$ is a periodic distribution, and $a_n = F(e^{-2\pi i n \cdot x})$ denote its
Fourier coefficients, then the $a_n$ are slowly increasing in the sense that for some
$N > 0$, $|a_n| \le O(|n|^N)$ as $|n| \to \infty$.

Moreover, the Fourier series $\sum_n a_n e^{2\pi i n \cdot x}$ converges to $F$ in the sense of
distributions.

[Hint: To prove the equivalence in (a), consider the "periodization" operator
$P : \mathcal{D}(\mathbb{R}^d) \to \mathcal{D}(\mathbb{T}^d)$,

$$P(\varphi)(x) = \sum_{h \in \mathbb{Z}^d} \tau_h(\varphi)(x) = \sum_{h \in \mathbb{Z}^d} \varphi(x - h).$$

Then find $\gamma \in \mathcal{D}(\mathbb{R}^d)$ so that $P(\gamma) = 1$. This allows one to prove that $P$ is surjective,
and that, in the same way, its dual $P^* : \mathcal{D}_2^* \to \mathcal{D}_1^*$ is also surjective. (Here $\mathcal{D}_1^*$
and $\mathcal{D}_2^*$ denote, respectively, the two spaces of distributions described in (a).) To
construct $\gamma$, pick $\psi \in \mathcal{D}(\mathbb{R}^d)$ so that $\psi \ge 0$ and $\psi = 1$ on $\{0 \le x_j \le 1,\ 1 \le j \le d\}$,
and let $\gamma = \psi / P(\psi)$.]

2. Suppose $Tf = f * K$ is a singular integral operator as in Theorem 3.2 of
Section 3. Then the mapping $f \mapsto T(f)$ is bounded on the Hardy space $\mathbf{H}^1_r$, and in
particular maps $\mathbf{H}^1_r$ to $L^1$.

[Hint: Consider first a 2-atom $a$ associated to the unit ball $B$. Then for an
appropriate constant $c$ (bounded independently of $a$) we have $T(a) = c(a^* + \psi)$. Here $a^*$ is
a 2-atom for the ball $B^*$, the double of $B$, and $\psi$ satisfies $|\psi(x)| \le (1 + |x|)^{-d-1}$,
$\int_{\mathbb{R}^d} \psi(x)\,dx = 0$. With this apply Exercise 21 in Chapter 2. Then obtain the analog
for 2-atoms $a$, after rescaling and translation.]


3. Prove the following interior estimates for an elliptic operator $L$ of order $m$ with
constant coefficients.

Suppose $O$ and $O_1$ are bounded subsets of $\mathbb{R}^d$ with $\overline{O} \subset O_1$. Assume $u$ and $f$
are $L^p$ functions in $O_1$ with $Lu = f$ in $O_1$ in the sense of distributions. Then if
$1 < p < \infty$ and $k$ is a non-negative integer, we have

$$\sum_{|\alpha| \le m+k} \|\partial_x^\alpha u\|_{L^p(O)} \le c \left( \sum_{|\beta| \le k} \|\partial_x^\beta f\|_{L^p(O_1)} + \|u\|_{L^p(O_1)} \right),$$

where the derivatives are taken in the sense of distributions.

[Hint: Consider the parametrix $Q_\epsilon = \eta_\epsilon Q$ given in Corollary 2.13 which is
supported in $|x| \le \epsilon$. Here $\epsilon$ is chosen so that $O_\epsilon \subset O_1$, where $O_\epsilon$ are the points of
distance $\le \epsilon$ from $O$.

Set $U = \psi u$, with $\psi$ a $C^\infty$ function that is 1 near $O_\epsilon$ but vanishes outside $O_1$.
Then

$$L(U) = \psi L(u) + \sum_{|\gamma| < m} \psi_\gamma \partial_x^\gamma u,$$

and what is important is that the $\psi_\gamma$ vanish in $O_\epsilon$. Now $U + r_\epsilon * U = Q_\epsilon * L(U)$,
where $r_\epsilon \in \mathcal{S}$. This gives

$$\psi u = Q_\epsilon * (\psi f) - r_\epsilon * (\psi u) + \sum_{|\gamma| < m} Q_\epsilon * (\psi_\gamma \partial_x^\gamma u).$$

As has been pointed out, $\partial_x^\gamma Q$ are Calderón-Zygmund distributions whenever $|\gamma| \le m$,
so the same is true for $Q_\epsilon$. Then using Theorem 3.2, the result follows.]


4.* Let $P(x)$ be any real polynomial in $\mathbb{R}^d$, and $k$ a homogeneous function of
degree $-d$ with $\int_{|x|=1} k(x)\,d\sigma(x) = 0$.

(a) One can define the tempered distribution $\mathrm{pv}\left(e^{iP(x)} k(x)\right) = K$ by

$$K(\varphi) = \lim_{\epsilon \to 0} \int_{|x| \ge \epsilon} e^{iP(x)} k(x) \varphi(x)\,dx.$$

(b) Then the Fourier transform of $K$ is a bounded function (with bound
independent of the coefficients of $P$).


5.* Let $Q$ be a fixed real-valued polynomial on $\mathbb{R}^d$. Consider the distributions
initially defined for $\mathrm{Re}(s) > 0$ by

$$I(s)(\varphi) = \int_{Q(x) > 0} |Q(x)|^s \varphi(x)\,dx,$$

where $\varphi \in \mathcal{S}$.

Then $I(s)(\varphi)$ has a meromorphic continuation to the whole complex $s$-plane, with
poles at most at $s = -k/m$, where $m$ is a positive integer determined by $Q$, and $k$
is any positive integer. The orders of the poles do not exceed $d$.

6.* As a consequence of the results in Problem 5*, one may prove the following.

(a) Suppose $L = \sum_{|\alpha| \le m} a_\alpha \partial_x^\alpha$ is a non-zero partial differential operator on $\mathbb{R}^d$
with $a_\alpha$ complex constants. Then $L$ has a tempered fundamental solution.
As an immediate corollary we also have:

(b) Suppose $P$ is a complex-valued polynomial on $\mathbb{R}^d$. Then there exists a
tempered distribution $F$ that agrees with $1/P$ where $P(x) \neq 0$.

In fact, let P be the characteristic polynomial of L and apply the result of the 
previous problem to Q = |P| 2 . Suppose I(s) has a pole of order r at s = 1 ， then 
define the tempered distribution F by 

Consequently, PF = 1, and the inverse Fourier transform of F is the desired fun¬ 
damental solution of L. 


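7.* Suppose $L = \sum_{|\alpha| \le m} a_\alpha \partial_x^\alpha$ is a partial differential operator on $\mathbb{R}^d$, with $a_\alpha$
complex constants. Then $L$ is hypo-elliptic if and only if for each $\alpha \neq 0$

$$\frac{\partial_\xi^\alpha P(\xi)}{P(\xi)} \to 0 \quad \text{as } |\xi| \to \infty,$$

where $P$ denotes the characteristic polynomial of $L$.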


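8.* We describe several fundamental solutions of the wave operator

$$\Box = \frac{\partial^2}{\partial t^2} - \Delta_x,$$

where $(x, t) \in \mathbb{R}^d \times \mathbb{R}$ and $\Delta_x = \sum_{j=1}^d \frac{\partial^2}{\partial x_j^2}$.

We let $\Gamma_+$ be the forward open cone $= \{(x, t) : t > |x|\}$, and $\Gamma_- = -\Gamma_+$, the
backward cone. For each $s$ with $\mathrm{Re}(s) > -1$ we define the function $F_s$ by

$$(29) \qquad F_s(x, t) = \begin{cases} a_s (t^2 - |x|^2)^{s/2}, & \text{if } (x, t) \in \Gamma_+, \\ 0 & \text{otherwise.} \end{cases}$$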


Here $a_s^{-1} = 2^{s+d} \pi^{(d-1)/2}\, \Gamma\!\left(\frac{s+d+1}{2}\right) \Gamma\!\left(\frac{s}{2} + 1\right)$. Then $s \mapsto F_s$ has an analytic
continuation in the complex $s$ plane as an entire (tempered) distribution-valued function.
Moreover, one can prove that $F_+ = F_s|_{s=-d+1}$ is a fundamental solution of $\Box$.

Note that $F_-$, obtained from $F_+$ by mapping $t \mapsto -t$, is also a fundamental
solution, and $F_+$ and $F_-$ are supported in $\overline{\Gamma_+}$ and $\overline{\Gamma_-}$ respectively. In addition,
if $d$ is odd and $d \ge 3$, then $a_s$ vanishes for $s = -d + 1$, so both $F_+$ and $F_-$ are
supported on the boundary of their cones, which is a reflection of Huygens'
principle.

Finally, a third fundamental solution $F_0$ of interest is given by

$$\widehat{F_0}(\xi, \tau) = \lim_{\epsilon \to 0,\ \epsilon > 0} \frac{1}{4\pi^2 (|\xi|^2 - \tau^2 + i\epsilon)},$$

with the limit taken in the sense of distributions, and $(\xi, \tau)$ representing the dual
variables to $(x, t)$. The fundamental solutions $F_+$, $F_-$, and $F_0$ are each
homogeneous of degree $-2$, and invariant under the Lorentz group of linear
transformations of determinant 1 that preserves $\Gamma_+$. Also each fundamental solution of
$\Box$ with these invariance properties can be written as $c_1 F_+ + c_2 F_- + c_3 F_0$, with
$c_1 + c_2 + c_3 = 1$.



Applications of the Baire 
Category Theorem 


We see the profound difference that lies between sets 
of the two categories; this difference lies not within 
denumerability, nor within density, since a set of the 
first category can have the power of the continuum and 
can also be dense in any interval one considers; but it 
is in some sense a combination of these two preceding 
notions. 

R. Baire, 1899 


In the late nineteenth century, Baire introduced in his doctoral dissertation
a notion of size for subsets of the real line which has since provided
many fascinating results. In fact, his careful study of functions led him to
the definition of the first and second category of sets. Roughly speaking,
sets of the first category are "small," while sets of the second category
are "large." In this sense the complement of a set of the first category is
"generic."

Over time the Baire category theorem has been applied to metric 
spaces in different and more abstract settings. Its noteworthy use has 
been to show that a number of phenomena in analysis, found first in 
specific counter-examples, are in fact generic occurrences. 

This chapter is organized as follows. We begin by stating and proving
the Baire category theorem, and then proceed with the presentation of
a variety of interesting applications. We start with the result about
continuous functions which Baire proved in his thesis: a pointwise limit
of continuous functions has itself "many" points of continuity. Also,
we shall prove the existence of a continuous but nowhere differentiable
function, as well as the existence of a continuous function with Fourier
series diverging at a point, by showing that the category theorem allows
us to see that such functions are indeed generic. We also deduce from
Baire's theorem two further general results, the open mapping and closed
graph theorems, and provide in each case an example of their use. Finally,
we apply the category theorem to show that a Besicovitch-Kakeya set is
generic in a natural class of subsets of $\mathbb{R}^2$.

1 The Baire category theorem

Although Baire proved his theorem on the real line, his result actually 
holds in the more general setting of complete metric spaces. For the 
purpose of the applications we have in mind it is better to have access to 
this more general formulation right away. Fortunately, the proof of the 
theorem remains very simple and elegant. 

To state the main result, we begin with a list of definitions. Let $X$ be
a metric space with metric $d$, carrying the natural topology induced by
$d$. In other words, a set $O$ in $X$ is open if for every $x \in O$ there exists
$r > 0$ so that $B_r(x) \subset O$, where $B_r(x)$ denotes the open ball centered at
$x$ and of radius $r$,

$$B_r(x) = \{y \in X : d(x, y) < r\}.$$

By definition, a set is closed if its complement is open.

We define the interior $E^\circ$ of a set $E \subset X$ to be the union of all open
sets contained in $E$. Also, the closure $\overline{E}$ of $E$ is the intersection of all
closed sets containing $E$. Since one checks easily that the union of any
collection of open sets is open, and the intersection of any collection of
closed sets is closed, we see that $E^\circ$ is the "largest" open set contained
in $E$, and $\overline{E}$ is the "smallest" closed set containing $E$.

Suppose $E$ is a subset of $X$. We say that the set $E$ is dense in $X$ if
$\overline{E} = X$. Also, the set $E$ is nowhere dense if the interior of its closure is
empty, $(\overline{E})^\circ = \emptyset$. For instance, any point in $\mathbb{R}^d$ is nowhere dense in $\mathbb{R}^d$.
Also, the Cantor set is nowhere dense in $\mathbb{R}$, but the rationals $\mathbb{Q}$ are not
since $\overline{\mathbb{Q}} = \mathbb{R}$. We note here that in general $E$ is closed and nowhere dense
if and only if $O = E^c$ is open and dense.

We now describe the central notion of category due to Baire, and the 
dichotomy it introduces. 

• A set $E \subset X$ is of the first category in $X$ if $E$ is a countable union
of nowhere dense sets in $X$. A set of the first category is sometimes
said to be "meager." A set $E$ that is not of the first category in $X$
is referred to as being of the second category in $X$.

• A set $E \subset X$ is defined to be generic if its complement is of the
first category.

Thus the idea of category is to describe "smallness" in purely topological
terms (involving closures, interiors, etc.). It reflects the idea that elements
of a set of the first category are to be thought of as "exceptional," while
those of a generic set are to be considered "typical." Connected with this
is the fact that a countable union of sets of the first category is of the
first category, while the countable intersection of generic sets is a generic
set. Also we record here the useful fact that any open dense set is generic
(this follows from our remark earlier).

In general relying on one’s intuition about the category of sets requires 
a little caution. For instance, there is no link between this notion and 
that of Lebesgue measure. Indeed, there are sets in [0,1] of the first 
category that are of full measure, and hence uncountable and dense. By 
the same token, there are generic sets of measure zero. (Some examples 
are discussed in Exercise 1.) 

The main result of Baire is that "the continuum is of the second
category." The key ingredient used in his argument is the fact that the real
line is complete. This is the main reason why his theorem immediately
carries over to the case of a complete metric space.

Theorem 1.1 Every complete metric space $X$ is of the second category
in itself, that is, $X$ cannot be written as the countable union of nowhere
dense sets.

Corollary 1.2 In a complete metric space, a generic set is dense. 

Proof of the theorem. We argue by contradiction, and assume that $X$
is a countable union of nowhere dense sets $F_n$,

$$(1) \qquad X = \bigcup_{n=1}^\infty F_n.$$

By replacing each $F_n$ by its closure, we may assume that each $F_n$ is
closed. It now suffices to find a point $x \in X$ with $x \notin \bigcup F_n$.

Since $F_1$ is closed and nowhere dense, hence not all of $X$, there exists an
open ball $B_1$ of some radius $r_1 > 0$ whose closure $\overline{B_1}$ is entirely contained
in $F_1^c$.

Since $F_2$ is closed and nowhere dense, the ball $B_1$ cannot be entirely
contained in $F_2$, otherwise $F_2$ would have a non-empty interior. Since $F_2$
is also closed, there exists a ball $B_2$ of some radius $r_2 > 0$ whose closure
$\overline{B_2}$ is contained in $B_1$ and also in $F_2^c$. Clearly, we may choose $r_2$ so that
$r_2 < r_1/2$.

Continuing in this fashion, we obtain a sequence of balls $\{B_n\}$ with
the following properties:

(i) The radius of $B_n$ tends to 0 as $n \to \infty$.

(ii) $\overline{B_{n+1}} \subset B_n$.

(iii) $F_n \cap \overline{B_n}$ is empty.

Choose any point $x_n$ in $B_n$. Then, $\{x_n\}_{n=1}^\infty$ is a Cauchy sequence
because of properties (i) and (ii) above. Since $X$ is complete, this sequence
converges to a limit which we denote by $x$. By (ii) we see that $x \in \overline{B_n}$
for each $n$, and hence $x \notin F_n$ for all $n$ by (iii). This contradicts (1), and
the proof of the Baire category theorem is complete.
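
The nested-ball construction can be traced concretely in the simplest possible case. The sketch below is only an illustration under assumptions that are not in the text: $X = [0,1]$, each $F_n$ is a single point (closed and nowhere dense), closed intervals play the role of the closed balls, and the name `nested_intervals_avoiding` is hypothetical. At each step it keeps a closed third of the current interval that misses the next point, so the lengths tend to 0 and every later interval avoids all earlier points.

```python
from fractions import Fraction

def nested_intervals_avoiding(points, lo=Fraction(0), hi=Fraction(1)):
    """Shrink [lo, hi] once per listed point so that the n-th interval
    misses the n-th point (which plays the role of F_n in the proof)."""
    intervals = []
    for p in points:
        third = (hi - lo) / 3
        # The left and right closed thirds are disjoint, so p lies in at
        # most one of them; keep one that does not contain p.
        if lo <= p <= lo + third:
            lo = hi - third      # p is in the left third: keep the right one
        else:
            hi = lo + third      # otherwise the left third misses p
        intervals.append((lo, hi))
    return intervals

# Usage: avoid the first few terms of an enumeration of rationals in [0, 1].
points = [Fraction(a, b) for b in range(1, 6) for a in range(b + 1)]
intervals = nested_intervals_avoiding(points)
final_lo, final_hi = intervals[-1]
```

Exact rational arithmetic is used so that membership tests are not affected by rounding. Running the loop over a full enumeration of a countable set would, in the limit, single out a point of $[0,1]$ outside that set, which is the content of the theorem in this special case.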

To prove the corollary, we argue by contradiction and assume that
$E \subset X$ is generic but not dense. Then there exists a closed ball $\overline{B}$ entirely
contained in $E^c$. Since $E$ is generic we can write $E^c = \bigcup_{n=1}^\infty F_n$ where
each $F_n$ is nowhere dense, hence

$$\overline{B} = \bigcup_{n=1}^\infty (F_n \cap \overline{B}).$$

It is clear that $F_n \cap \overline{B}$ is nowhere dense, hence the above contradicts
Theorem 1.1 applied to the complete metric space $\overline{B}$, and the corollary
is proved.

The theorem actually extends to certain cases of metric spaces that are
not complete, in particular to open subsets of a complete metric space.
To be precise, suppose we are given a subset $X_0$ of a complete metric
space $X$. Then $X_0$ is itself a metric space, inheriting its metric from $X$
by restricting the metric on $X$ to $X_0$. The fact is that if $X_0$ is an open
subset of $X$, then the conclusion of the theorem holds for it; that is, $X_0$
cannot be written as a countable union of sets that are nowhere dense
(in $X_0$). See Exercise 3. A simple example is given by the open interval
$(0,1)$ with the usual metric.

1.1 Continuity of the limit of a sequence of continuous functions 

Suppose $X$ is a complete metric space, $\{f_n\}$ is a sequence of continuous
complex-valued functions on $X$, and that the limit

$$\lim_{n \to \infty} f_n(x) = f(x)$$

exists for each $x \in X$. It is well known that if the limit is uniform in $x$,
then the limiting function $f$ is also continuous. In general, when the limit
is just pointwise, we may ask: must $f$ have at least one point of continuity?
We answer this question affirmatively with a simple application
of the category theorem.

Theorem 1.3 Suppose that $\{f_n\}$ is a sequence of continuous complex-valued
functions on a complete metric space $X$, and

$$\lim_{n \to \infty} f_n(x) = f(x)$$

exists for every $x \in X$. Then, the set of points where $f$ is continuous is
a generic set in $X$. In other words, the set of points where $f$ is
discontinuous is of the first category.

Therefore $f$ is in fact continuous at "most" points of $X$.

To show that the set $\mathcal{D}$ of discontinuities of $f$ is of the first category,
we use a characterization of points of continuity of $f$ in terms of its
oscillations. More precisely, we define the oscillation of the function $f$
at a point $x$ by

$$\mathrm{osc}(f)(x) = \lim_{r \to 0} \omega(f)(r, x), \quad \text{where } \omega(f)(r, x) = \sup_{y,z \in B_r(x)} |f(y) - f(z)|.$$

The limit exists since the quantity $\omega(f)(r, x)$ decreases with $r$. In
particular, we see that $\mathrm{osc}(f)(x) < \epsilon$ if there exists a ball $B$ centered at $x$
so that $|f(y) - f(z)| < \epsilon$ whenever $y, z \in B$. Two more observations are
in order:

(i) $\mathrm{osc}(f)(x) = 0$ if and only if $f$ is continuous at $x$.

(ii) The set $E_\epsilon = \{x \in X : \mathrm{osc}(f)(x) < \epsilon\}$ is open.

Property (i) follows immediately from the definition of continuity. For (ii),
we note that if $x \in E_\epsilon$, there is an $r > 0$ so that $\sup_{y,z \in B_r(x)} |f(y) - f(z)| < \epsilon$.
Consequently, if $x^* \in B_{r/2}(x)$, then $x^* \in E_\epsilon$ because

$$\sup_{y,z \in B_{r/2}(x^*)} |f(y) - f(z)| \le \sup_{y,z \in B_r(x)} |f(y) - f(z)| < \epsilon.$$
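
As a rough numerical illustration of the oscillation, consider the pointwise limit of $f_n(x) = x^n$ on $[0,1]$, which is 0 for $x < 1$ and 1 at $x = 1$. The sketch below is an approximation under stated assumptions, not part of the text: the names `limit_f`, `omega`, and `osc` are hypothetical, the supremum is replaced by a maximum over finitely many sample points of the closed interval $[x - r, x + r] \cap [0,1]$, and the limit $r \to 0$ is replaced by a minimum over a few radii.

```python
def limit_f(x):
    """Pointwise limit of x**n on [0, 1]: 0 for x < 1 and 1 at x = 1."""
    return 1.0 if x >= 1.0 else 0.0

def omega(f, r, x, samples=2001):
    """Sampled estimate of sup |f(y) - f(z)| over y, z within r of x."""
    lo, hi = max(0.0, x - r), min(1.0, x + r)
    # Include the right endpoint exactly, so rounding cannot skip x = 1.
    grid = [lo + (hi - lo) * i / (samples - 1) for i in range(samples - 1)] + [hi]
    values = [f(t) for t in grid]
    # For a real-valued f the supremum of |f(y) - f(z)| is max - min.
    return max(values) - min(values)

def osc(f, x, radii=(1e-1, 1e-2, 1e-3, 1e-4)):
    """omega decreases with r, so the smallest sampled value estimates the limit."""
    return min(omega(f, r, x) for r in radii)
```

With these definitions the estimate is 1 at the single discontinuity $x = 1$ and 0 at interior points such as $x = 0.5$, even though `omega` with a large radius around a point near 1 still sees the jump.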

Lemma 1.4 Suppose $\{f_n\}$ is a sequence of continuous functions on a
complete metric space $X$, and $f_n(x) \to f(x)$ for each $x$ as $n \to \infty$. Then,
given an open ball $B \subset X$ and $\epsilon > 0$, there exists an open ball $B_0 \subset B$
and an integer $m \ge 1$ so that $|f_m(x) - f(x)| \le \epsilon$ for all $x \in B_0$.

Proof. Let $Y$ denote a closed ball contained in $B$. Note that $Y$ is
itself a complete metric space. Define

$$E_\ell = \{x \in Y : \sup_{j,k \ge \ell} |f_j(x) - f_k(x)| \le \epsilon\}.$$

Then, since $f_n(x)$ converges for every $x \in X$, we must have

$$(2) \qquad Y = \bigcup_{\ell=1}^\infty E_\ell.$$

Moreover, each $E_\ell$ is closed since it is the intersection of sets of the
type $\{x : |f_j(x) - f_k(x)| \le \epsilon\}$ which are closed by the continuity of
$f_j$ and $f_k$. Therefore, by Theorem 1.1 applied to the complete metric
space $Y$, some set in the union (2), say $E_m$, must contain an open ball
$B_0$. By construction,

$$\sup_{j,k \ge m} |f_j(x) - f_k(x)| \le \epsilon \quad \text{whenever } x \in B_0,$$

and letting $k$ tend to infinity we find that $|f_m(x) - f(x)| \le \epsilon$ for all $x \in B_0$.
This proves the lemma.

To finish the proof of Theorem 1.3, we define

$$F_n = \{x \in X : \mathrm{osc}(f)(x) \ge 1/n\},$$

in other words, $F_n = E_\epsilon^c$ with $\epsilon = 1/n$ in the notation of (ii) above.
Then, by our observation (i), we have

$$\mathcal{D} = \bigcup_{n=1}^\infty F_n,$$

where we recall that $\mathcal{D}$ is the set of discontinuities of $f$. The theorem
will be proved if we can show that each $F_n$ is nowhere dense.

Fix $n \ge 1$. Since $F_n$ is closed, we must show that it has empty interior.
Assume on the contrary, that $B$ is an open ball with $B \subset F_n$. Then, if
we set $\epsilon = 1/(4n)$ in the lemma, we find that there is an open ball $B_0 \subset B$,
and an integer $m \ge 1$ so that

(3) $|f_m(x) - f(x)| \le 1/(4n)$, for all $x \in B_0$.

By the continuity of $f_m$, we may find a ball $B' \subset B_0$ so that

(4) $|f_m(y) - f_m(z)| \le 1/(4n)$, for all $y, z \in B'$.

Then, the triangle inequality implies

$$|f(y) - f(z)| \le |f(y) - f_m(y)| + |f_m(y) - f_m(z)| + |f_m(z) - f(z)|.$$

If $y, z \in B'$, the first and third terms are bounded by $1/(4n)$ because of
condition (3). The middle term is also bounded by $1/(4n)$ due to (4).
Therefore

$$|f(y) - f(z)| \le \frac{3}{4n} < \frac{1}{n} \quad \text{whenever } y, z \in B'.$$

Consequently, if $x'$ denotes the center of $B'$, we have $\mathrm{osc}(f)(x') < 1/n$
which contradicts the fact that $x' \in F_n$. This concludes the proof of the
theorem.


1.2 Continuous functions that are nowhere differentiable 

Our next application of the category theorem is to the problem of the 
existence of a continuous function that is nowhere differentiable. 

Our first answer to this question appeared in Chapter 4 of Book I where
we showed that the complex-valued function $f$ given by the following
lacunary Fourier series

$$f(x) = \sum_{n=0}^\infty 2^{-n\alpha} e^{i 2^n x} \quad \text{with } 0 < \alpha < 1$$

is continuous but nowhere differentiable. Moreover, a slight change in
the proof shows that both the real and imaginary parts of $f$ are also
nowhere differentiable. Other examples arose in Chapter 7 of Book III,
in the context of the von Koch and space-filling curves.

Here, we prove the existence of such functions by showing that they
are generic in an appropriate complete metric space. The space we have
in mind consists of all real-valued continuous functions on $[0,1]$, which
we denote by

$$X = C([0,1]).$$

This vector space is equipped with the sup-norm

$$\|f\| = \sup_{x \in [0,1]} |f(x)|.$$

Together with this norm, $C([0,1])$ is a complete normed vector space (a
Banach space). The completeness follows because the uniform limit of a
sequence of continuous functions is necessarily continuous. Finally, the
metric $d$ on $X$ is chosen to be $d(f, g) = \|f - g\|$, and hence $(X, d)$ is a
complete metric space.


Theorem 1.5 The set of functions in $C([0,1])$ that are nowhere
differentiable is generic.

We must show that the set $\mathcal{D}$ of continuous functions in $[0,1]$ that are
differentiable at least at one point is of the first category. To this end,
we let $E_N$ denote the set of all continuous functions so that there exists
$0 \le x^* \le 1$ with

(5) $|f(x) - f(x^*)| \le N|x - x^*|$, for all $x \in [0,1]$.

These sets are related to $\mathcal{D}$ by the inclusion

$$\mathcal{D} \subset \bigcup_{N=1}^\infty E_N.$$

To prove the theorem it suffices to show that for each $N$, the set $E_N$ is
nowhere dense. This will be achieved by showing successively:

(i) $E_N$ is a closed set.

(ii) the interior of $E_N$ is empty.

Thus $\bigcup E_N$ is of the first category, hence so is the set $\mathcal{D}$.

Proof of property (i) 

Suppose that $\{f_n\}$ is a sequence of functions in $E_N$ so that $\|f_n - f\| \to 0$.
We must show that $f \in E_N$. Let $x_n^*$ be a point in $[0,1]$ for which (5)
holds with $f$ replaced by $f_n$. We may choose a subsequence $\{x_{n_k}^*\}$ that
converges to a limit in $[0,1]$, which we denote by $x^*$. Then,

$$|f(x) - f(x^*)| \le |f(x) - f_{n_k}(x)| + |f_{n_k}(x) - f_{n_k}(x^*)| + |f_{n_k}(x^*) - f(x^*)|.$$

On the one hand, since $\|f_n - f\| \to 0$, we see that given $\epsilon > 0$, there
exists $K > 0$ so that whenever $k > K$ the first and third terms together
are $< \epsilon$. On the other hand, we may estimate the middle term by

$$|f_{n_k}(x) - f_{n_k}(x^*)| \le |f_{n_k}(x) - f_{n_k}(x_{n_k}^*)| + |f_{n_k}(x_{n_k}^*) - f_{n_k}(x^*)|.$$

Therefore, applying the fact that $f_{n_k} \in E_N$ twice yields

$$|f_{n_k}(x) - f_{n_k}(x^*)| \le N|x - x_{n_k}^*| + N|x_{n_k}^* - x^*|.$$

Putting all these estimates together, we obtain

$$|f(x) - f(x^*)| \le \epsilon + N|x - x_{n_k}^*| + N|x_{n_k}^* - x^*|$$

for all $k > K$. Letting $k$ tend to infinity, and recalling that $x_{n_k}^* \to x^*$, we
get

$$|f(x) - f(x^*)| \le \epsilon + N|x - x^*|.$$

Since $\epsilon$ is arbitrary, we conclude that $f \in E_N$, and (i) is proved.

Proof of property (ii) 

To show that $E_N$ has no interior, let $\mathcal{P}$ denote the subspace of $C([0,1])$
that consists of all continuous piecewise-linear functions. Also, for each
$M > 0$, let $\mathcal{P}_M \subset \mathcal{P}$ denote the set of all continuous piecewise-linear
functions, each of whose line segments has slope either $\ge M$ or $\le -M$.
Functions in $\mathcal{P}_M$ are naturally called "zig-zag" functions. Note the key
fact that $\mathcal{P}_M$ is disjoint from $E_N$ if $M > N$.

Lemma 1.6 For every $M > 0$, the set $\mathcal{P}_M$ of zig-zag functions is dense
in $C([0,1])$.

Proof. It is plain that given $\epsilon > 0$ and a continuous function $f$,
there exists a function $g \in \mathcal{P}$ so that $\|f - g\| \le \epsilon$. Indeed, since $f$ is
continuous on the compact set $[0,1]$ it must be uniformly continuous,
and there exists $\delta > 0$ so that $|f(x) - f(y)| \le \epsilon$ whenever $|x - y| < \delta$. If
we choose $n$ so large that $1/n < \delta$, and define $g$ as a linear function on
each interval $[k/n, (k+1)/n]$ for $k = 0, \ldots, n-1$ with $g(k/n) = f(k/n)$,
$g((k+1)/n) = f((k+1)/n)$, we see at once that $\|f - g\| \le \epsilon$.

It now suffices to see how to approximate $g$ on $[0,1]$ by zig-zag functions
in $\mathcal{P}_M$. Indeed, if $g$ is given by $g(x) = ax + b$ for $0 \le x \le 1/n$, consider
the two segments

$$\varphi_\epsilon(x) = g(x) + \epsilon \quad \text{and} \quad \psi_\epsilon(x) = g(x) - \epsilon.$$

Then, beginning at $g(0)$, we travel on a line segment of slope $+M$ until
we intersect $\varphi_\epsilon$. Then, we reverse direction and travel on a line segment
of slope $-M$ until we intersect $\psi_\epsilon$ (see Figure 1). (If $|a| \ge M$, we use
slopes $\pm M'$ with $M' > |a|$ instead, which is permitted since the slopes
of a function in $\mathcal{P}_M$ need only be $\ge M$ or $\le -M$.)

We obtain $h \in \mathcal{P}_M$ so that

$$\psi_\epsilon(x) \le h(x) \le \varphi_\epsilon(x), \quad \text{for all } 0 \le x \le 1/n,$$

and therefore $|h(x) - g(x)| \le \epsilon$ in $[0, 1/n]$.

Then, we begin at $h(1/n)$ and repeat this argument on the interval
$[1/n, 2/n]$. Continuing in this fashion, we obtain a function $h \in \mathcal{P}_M$ with
$\|h - g\| \le \epsilon$. Hence $\|f - h\| \le 2\epsilon$, and the lemma is proved.
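
The zig-zag construction in this proof is effectively an algorithm, and a minimal sketch of it follows. Everything named here is an assumption made for illustration: the helper names `interpolate` and `zigzag`, the representation of $g$ by its knots and values, and the requirement that $M$ exceed every slope of $g$ (so that a segment of slope $\pm M$ is guaranteed to meet the band edge). The function returns the breakpoints of $h$.

```python
def interpolate(knots, values, x):
    """Evaluate the piecewise-linear interpolant g of (knots, values) at x."""
    for x0, x1, g0, g1 in zip(knots, knots[1:], values, values[1:]):
        if x0 <= x <= x1:
            return g0 + (g1 - g0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the knots")

def zigzag(knots, values, M, eps):
    """Breakpoints (xs, hs) of a zig-zag h with slopes +M and -M that stays
    within eps of the piecewise-linear g given by (knots, values)."""
    xs, hs = [knots[0]], [values[0]]
    direction = 1.0                       # +1: climbing toward g + eps
    for x0, x1, g0, g1 in zip(knots, knots[1:], values, values[1:]):
        a = (g1 - g0) / (x1 - x0)         # slope of g on this interval
        if abs(a) >= M:
            raise ValueError("this sketch needs M larger than every slope of g")
        x, h = xs[-1], hs[-1]
        while x < x1:
            g_here = g0 + a * (x - x0)
            # Distance to travel before the line of slope direction*M
            # meets the band edge g + direction*eps.
            dt = max((g_here + direction * eps - h) / (direction * M - a), 0.0)
            if x + dt >= x1:              # interval ends first: stop at the knot
                h += direction * M * (x1 - x)
                x = x1
            else:                         # hit the band edge: reverse direction
                h += direction * M * dt
                x += dt
                direction = -direction
            if x > xs[-1]:
                xs.append(x)
                hs.append(h)
    return xs, hs

# Usage on a small piecewise-linear g with slopes 1.2 and -0.8.
knots = [0.0, 0.25, 0.5, 0.75, 1.0]
values = [0.0, 0.3, 0.1, 0.4, 0.2]
M, eps = 20.0, 0.05
xs, hs = zigzag(knots, values, M, eps)
```

Because both $h$ and $g$ are linear between consecutive breakpoints of $h$ (which include the knots of $g$), checking $|h - g| \le \epsilon$ at the breakpoints is enough to check it everywhere.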

We deduce at once from this lemma that $E_N$ has no interior points.
Indeed, given any $f \in E_N$ and $\epsilon > 0$, we first choose a fixed $M > N$.
Then, there exists $h \in \mathcal{P}_M$ so that $\|f - h\| < \epsilon$, and moreover $h \notin E_N$
since $M > N$. Therefore, no open ball around $f$ is entirely contained
in $E_N$, which is the desired conclusion. Theorem 1.5 is proved.

2 The uniform boundedness principle

Next, we turn to another corollary of Baire’s theorem, one that itself has 
many applications. The main conclusion we find is that if a sequence of 
continuous linear functionals is pointwise bounded on a “large” set, then 
this sequence must in fact be bounded. 

Theorem 2.1 Suppose that $\mathcal{B}$ is a Banach space, and $\mathcal{L}$ is a collection
of continuous linear functionals on $\mathcal{B}$.

(i) If $\sup_{\ell \in \mathcal{L}} |\ell(f)| < \infty$ for each $f \in \mathcal{B}$, then

$$\sup_{\ell \in \mathcal{L}} \|\ell\| < \infty.$$

(ii) This conclusion also holds if we only assume that $\sup_{\ell \in \mathcal{L}} |\ell(f)| < \infty$
for all $f$ in some set of the second category.

We note that the collection $\mathcal{L}$ need not be countable.

Proof. It suffices to show (ii) since by Baire's theorem, $\mathcal{B}$ is of the
second category. So suppose that $\sup_{\ell \in \mathcal{L}} |\ell(f)| < \infty$ for all $f \in E$, where
$E$ is of the second category.

For each positive integer $M$, define

$$E_M = \{f \in \mathcal{B} : \sup_{\ell \in \mathcal{L}} |\ell(f)| \le M\}.$$

Then, the hypothesis in the theorem guarantees that

$$E \subset \bigcup_{M=1}^\infty E_M.$$

Moreover, each $E_M$ is closed, since it can be written as an intersection
$E_M = \bigcap_{\ell \in \mathcal{L}} E_{M,\ell}$, where $E_{M,\ell} = \{f : |\ell(f)| \le M\}$ is closed by the
continuity of $\ell$. Since $E$ is of the second category, some $E_M$ must have
non-empty interior, say when $M = M_0$. In other words, there exists
$f_0 \in \mathcal{B}$, and $r > 0$ so that $B_r(f_0) \subset E_{M_0}$. Hence for all $\ell \in \mathcal{L}$ we have

$$|\ell(f)| \le M_0 \quad \text{whenever } \|f - f_0\| < r.$$

As a result, for all $\|g\| < r$, and all $\ell \in \mathcal{L}$ we have

$$|\ell(g)| \le |\ell(g + f_0)| + |\ell(-f_0)| \le 2M_0,$$

and this implies the conclusion (ii) in the theorem.


2.1 Divergence of Fourier series 

We now consider the problem of the existence of a continuous function 
whose Fourier series diverges at a point. 

In Book I we gave an explicit construction of a function with this
property. The main idea there was to break the symmetry inherent in
the Fourier series $\sum_{|n| \neq 0} e^{inx}/n$ of the sawtooth function.

The solution we present here, which relies on a simple application 
of the uniform boundedness principle, provides only the existence of a 
continuous function with diverging Fourier series. However, we also learn 
that, in fact, a generic set of continuous functions have this property. 

Let $\mathcal{B} = C([-\pi, \pi])$ be the Banach space of continuous complex-valued
functions on $[-\pi, \pi]$ with the usual sup-norm $\|f\| = \sup_{x \in [-\pi, \pi]} |f(x)|$.
The Fourier coefficients of $f \in \mathcal{B}$ are defined by

$$a_n = \hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx, \quad \text{for all } n \in \mathbb{Z},$$

and the Fourier series of $f$ is

$$f(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}.$$

Also, the $N$-th partial sum of this Fourier series is defined by

$$S_N(f)(x) = \sum_{n=-N}^{N} a_n e^{inx}.$$

We saw in Book I an elegant expression for these partial sums in terms
of convolutions, namely

$$S_N(f)(x) = (f * D_N)(x)$$

where

$$D_N(x) = \sum_{n=-N}^{N} e^{inx} = \frac{\sin[(N + 1/2)x]}{\sin(x/2)}$$

is the Dirichlet kernel, and

$$(f * g)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) g(x - y)\,dy = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) g(y)\,dy$$

is the convolution on the circle.
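
The closed form of the Dirichlet kernel is easy to sanity-check numerically. The short sketch below compares the exponential sum with the quotient of sines at a few points where $\sin(x/2) \neq 0$; the function names `dirichlet_sum` and `dirichlet_closed` are hypothetical and chosen only for this illustration.

```python
import cmath
import math

def dirichlet_sum(N, x):
    """D_N(x) as the sum of e^{inx} over -N <= n <= N (the sum is real)."""
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1)).real

def dirichlet_closed(N, x):
    """D_N(x) as sin((N + 1/2) x) / sin(x / 2), valid when sin(x / 2) != 0."""
    return math.sin((N + 0.5) * x) / math.sin(x / 2)

# Largest discrepancy between the two expressions over a small grid.
max_gap = max(
    abs(dirichlet_sum(N, x) - dirichlet_closed(N, x))
    for N in (0, 1, 5, 20)
    for x in (0.1, 1.0, 2.5, -3.0)
)
```

At $x = 0$ the quotient must be read as its limit $2N + 1$, which is also the value of the sum there.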


Theorem 2.2 Let $\mathcal{B}$ denote the Banach space of continuous functions
on $[-\pi, \pi]$ with the sup-norm.

(i) Given any point $x_0 \in [-\pi, \pi]$, there is a continuous function whose
Fourier series diverges at $x_0$.

(ii) In fact, the set of continuous functions whose Fourier series diverge
on a dense set in $[-\pi, \pi]$ is generic in $\mathcal{B}$.

For a stronger version of these results, see Problem 3. 

We begin with (i), and assume without loss of generality that $x_0 = 0$.
Let $\ell_N$ denote the linear functional on $\mathcal{B}$ defined by

$$\ell_N(f) = S_N(f)(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(-y) D_N(y)\,dy.$$

If (i) were not true, then $\sup_N |\ell_N(f)| < \infty$ for every $f \in \mathcal{B}$. Moreover, if
we knew that each $\ell_N$ is continuous, the uniform boundedness principle
would then imply that $\sup_N \|\ell_N\| < \infty$. The proof of (i) will thus be
complete if we can show that each $\ell_N$ is continuous yet $\|\ell_N\| \to \infty$ as $N$
tends to infinity.

Now, $\ell_N$ is continuous for each $N$, since

$$|\ell_N(f)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(-y)|\,|D_N(y)|\,dy \le L_N \|f\|,$$

where we have defined

$$L_N = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_N(y)|\,dy.$$

In fact, the norm of the linear functional $\ell_N$ is precisely equal to the
integral $L_N$.

Lemma 2.3 $\|\ell_N\| = L_N$ for all $N \ge 0$.

Proof. We already know from the above that $\|\ell_N\| \le L_N$. To prove
the reverse inequality, it suffices to find a sequence of continuous functions
$\{f_k\}$ so that $\|f_k\| \le 1$, and $\ell_N(f_k) \to L_N$ as $k \to \infty$. To do so, first let $g$
denote the function equal to 1 when $D_N$ is positive and $-1$ when $D_N$ is
negative. Then $g$ is measurable, $\|g\| \le 1$, and

$$L_N = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(-y) D_N(y)\,dy,$$

where we used the fact that $D_N$ is even, hence $g(y) = g(-y)$. Clearly,
there exists a sequence of continuous functions $\{f_k\}$ with $-1 \le f_k(x) \le 1$
for all $-\pi \le x \le \pi$, and so that

$$\int_{-\pi}^{\pi} |f_k(y) - g(y)|\,dy \to 0 \quad \text{as } k \to \infty.$$

As a result, we find that $\ell_N(f_k) \to L_N$ as $k \to \infty$, while $\|f_k\| \le 1$, hence
$\|\ell_N\| \ge L_N$ as desired.

The proof of part (i) in the theorem will be complete if we can show
that $\|\ell_N\| = L_N$ tends to infinity as $N \to \infty$. This is precisely the content
of our final lemma.

Lemma 2.4 There is a constant $c > 0$ so that $L_N \ge c \log N$.

Proof. Since I siny\/\y\ < 1 for all y, and siny is an odd function, we 
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see that 1 


•丌 


Lj\j > c 


sin(7V + l/2)y 


0 \y 

,(7V+1/2)tt 


dy 


> c 


sin a: 


dx 



x 

sin a: 


dx 


> c 


fc=0 


X 

(/c + 1)tt J kn 


sin a: 


dx. 


However, for all k we 


have £ +1) 


sin a: 


dx = Jo I 


sin a:I dx, so that 


N-l 

ln > c 

k=0 


k ~h 1 


> c log 7V ， 


as was to be shown. 
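The logarithmic growth of $L_N$ is easy to observe numerically. The following sketch assumes the closed form $D_N(y) = \sin((N + 1/2)y)/\sin(y/2)$ and a midpoint rule with a hypothetical grid size `M`. The comparison constant $4/\pi^2$ is the classical asymptotic constant for these integrals; it is quoted here as an outside fact and is not derived in the text, which only needs some $c > 0$.

```python
import math

def lebesgue_constant(N, M=20000):
    """Midpoint-rule approximation of L_N = (1/2pi) * int_{-pi}^{pi} |D_N(y)| dy."""
    h = math.pi / M
    total = 0.0
    for i in range(M):
        y = (i + 0.5) * h  # midpoints avoid y = 0, where the quotient is 0/0
        total += abs(math.sin((N + 0.5) * y) / math.sin(0.5 * y))
    return total * h / math.pi  # |D_N| is even: integrate over (0, pi] and double

for N in (2, 10, 100):
    L = lebesgue_constant(N)
    # Ratio against log N: it stays bounded below, as Lemma 2.4 asserts.
    print(N, L, L / math.log(N))
```

Increasing `M` refines the quadrature; for the values of $N$ used here the grid resolves each oscillation of $D_N$ with many sample points.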

The proof of (ii) in Theorem 2.2 is immediate. Indeed, part (ii) of the
uniform boundedness principle, together with what we have just shown,
guarantees that the set of continuous functions $f$ for which
$\sup_N |S_N(f)(0)| < \infty$ is of the first category, and consequently, the set
of functions whose Fourier series converges at the origin is also of the
first category. Therefore the set of functions whose Fourier series
diverges at the origin is generic. Similarly, if $\{x_1, x_2, \ldots\}$ is any countable
collection of points in $[-\pi, \pi]$, then for each $j$, the set $F_j$ of continuous
functions whose Fourier series diverge at $x_j$ is also generic. Hence the
set $F = \bigcap_{j=1}^{\infty} F_j$, which consists of continuous functions whose Fourier series
diverge at every point $x_1, x_2, \ldots$, is also generic, and the proof of the
theorem is complete.


1 In this calculation, the value of the constant $c$ may change from line to line.


3 The open mapping theorem

Let $X$ and $Y$ be Banach spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$ respectively,
and $T : X \to Y$ a mapping. Observe that $T$ is continuous if and only if
$\{x \in X : T(x) \in \mathcal{O}\}$ is open in $X$ whenever $\mathcal{O}$ is open in $Y$. This holds
regardless of whether $T$ is linear or not. In particular, if $T$ has an inverse
$S : Y \to X$ that is also continuous, the above observation applied to $S$
shows that the image by $T$ of any open set in $X$ is open in $Y$. A mapping
$T$ that maps open sets to open sets is called an open mapping.

We recall that a mapping $T : X \to Y$ is surjective if $T(X) = Y$, and
injective if $T(x) = T(y)$ implies $x = y$. Also, $T$ is bijective if it is both
surjective and injective.

A bijective mapping has an inverse $T^{-1} : Y \to X$ defined as follows: if
$y \in Y$ then $T^{-1}(y)$ is the unique element $x \in X$ so that $T(x) = y$. This
definition is unambiguous precisely because $T$ is surjective and injective.
In general, if $T$ is linear, then the inverse $T^{-1}$ is also linear, but $T^{-1}$
need not be continuous. However, by the previous observation, we see
that $T^{-1}$ will be continuous if $T$ is an open mapping. The next result
says that surjectivity guarantees openness.

Theorem 3.1 Suppose $X$ and $Y$ are Banach spaces, and $T : X \to Y$ is
a continuous linear transformation. If $T$ is surjective, then $T$ is an open
mapping.

Proof. We denote by $B_X(x, r)$ and $B_Y(y, r)$ the open balls of radius $r$
centered at $x \in X$ and $y \in Y$ respectively, and we write simply $B_X(r)$
and $B_Y(r)$ for the open balls centered at the origin. Since $T$ is linear,
it suffices to show that $T(B_X(1))$ contains an open ball centered at the
origin.

First, we prove the weaker statement that the closure $\overline{T(B_X(1))}$ contains an open
ball centered at the origin. To see this, note that since $T$ is surjective,
we must have

\[
Y = \bigcup_{n=1}^{\infty} T(B_X(n)).
\]

By the Baire category theorem, not all the sets $T(B_X(n))$ can be nowhere
dense, so for some $n$, the set $\overline{T(B_X(n))}$ must contain an interior point.
As a result of the fact that $T$ is linear, this implies that

\[
\overline{T(B_X(1))} \supset B_Y(y_0, \epsilon)
\]

for some $y_0 \in Y$, and $\epsilon > 0$. By definition of the closure, we may pick
a point $y_1 = T(x_1)$ where $x_1 \in B_X(1)$ and $\|y_1 - y_0\|_Y < \epsilon/2$. Then, if
$y \in B_Y(\epsilon/2)$, we find that $y - y_1$ belongs to $\overline{T(B_X(1))}$, and writing $y =
T(x_1) + y - y_1$ we find that $y \in \overline{T(B_X(2))}$. Therefore, the ball $B_Y(\epsilon/2)$
is contained in $\overline{T(B_X(2))}$. Using once again the fact that $T$ is linear, we
see that $B_Y(\epsilon/4)$ is contained in $\overline{T(B_X(1))}$, and this proves the weaker
claim. In fact, replacing $T$ by $(4/\epsilon)T$, we may assume that

\[
\overline{T(B_X(1))} \supset B_Y(1), \tag{6}
\]

and consequently

\[
\overline{T(B_X(2^{-k}))} \supset B_Y(2^{-k}), \quad \text{for all } k. \tag{7}
\]

Next, we strengthen the result and show that in fact

\[
T(B_X(1)) \supset B_Y(1/2). \tag{8}
\]

Indeed, let $y \in B_Y(1/2)$, and by (7) with $k = 1$, select a point $x_1 \in
B_X(1/2)$ so that $y - T(x_1) \in B_Y(1/2^2)$. Then, by (7) again, applied with
$k = 2$, we may find $x_2 \in B_X(1/2^2)$ so that $y - T(x_1) - T(x_2) \in B_Y(1/2^3)$.
Continuing this process, we obtain a sequence of points $\{x_1, x_2, \ldots\}$ so
that $\|x_k\|_X < 1/2^k$. Since $X$ is complete, the sum $x_1 + x_2 + \cdots$
converges to a limit $x \in X$ with $\|x\| < \sum_{k=1}^{\infty} 1/2^k = 1$. Moreover, since we
have

\[
y - T(x_1) - \cdots - T(x_k) \in B_Y(1/2^{k+1}),
\]

and $T$ is continuous, we find in the limit that $T(x) = y$. This implies (8),
which then clearly implies that $T(B_X(1))$ contains an open ball centered
at the origin.

We gather two interesting corollaries to this theorem. 

Corollary 3.2 If $X$ and $Y$ are Banach spaces, and $T : X \to Y$ is a
continuous bijective linear transformation, then the inverse $T^{-1} : Y \to X$
of $T$ is also continuous. Hence there are constants $c, C > 0$ with

\[
c\|f\|_X \le \|T(f)\|_Y \le C\|f\|_X \quad \text{for all } f \in X.
\]

This follows immediately from the discussion preceding Theorem 3.1.

Recall that two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on a vector space $V$ are said to
be equivalent if there are constants $c, C > 0$ so that

\[
c\|v\|_2 \le \|v\|_1 \le C\|v\|_2 \quad \text{for all } v \in V.
\]
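A one-sided bound between two norms does not by itself give equivalence. As an illustration that is not taken from the text, consider the continuous functions on $[0,1]$ with the $L^1$ norm and the sup-norm: one always has $\|f\|_{L^1} \le \|f\|_\infty$, but for $f_k(x) = x^k$ the ratio $\|f_k\|_{L^1}/\|f_k\|_\infty = 1/(k+1)$ tends to zero, so no constant $c > 0$ can give the reverse inequality. (This space is not complete in the $L^1$ norm.) The sketch below computes the ratio with a midpoint rule; the function name and grid size are hypothetical.

```python
def l1_norm_of_power(k, M=10000):
    """Midpoint-rule approximation of the L^1 norm of f_k(x) = x**k on [0, 1]."""
    h = 1.0 / M
    return sum(((i + 0.5) * h) ** k for i in range(M)) * h

SUP_NORM = 1.0  # the sup of x**k on [0, 1] is attained at x = 1

for k in (1, 9, 99):
    ratio = l1_norm_of_power(k) / SUP_NORM
    print(k, ratio)  # the ratio is close to 1 / (k + 1) and decreases to 0
```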

Corollary 3.3 Suppose the vector space $V$ is equipped with two norms
$\|\cdot\|_1$ and $\|\cdot\|_2$. If

\[
\|v\|_1 \le C\|v\|_2 \quad \text{for all } v \in V,
\]

and $V$ is complete with respect to both norms, then $\|\cdot\|_1$ and $\|\cdot\|_2$ are
equivalent.

Indeed, the hypothesis implies that the identity mapping $I : (V, \|\cdot\|_2) \to
(V, \|\cdot\|_1)$ is continuous, and since it is clearly bijective, its inverse $I :
(V, \|\cdot\|_1) \to (V, \|\cdot\|_2)$ is also continuous. Hence $c\|v\|_2 \le \|v\|_1$ for some
$c > 0$ and all $v \in V$.


3.1 Decay of Fourier coefficients of $L^1$-functions


We return to the Fourier series discussed in Section 2.1 for an interesting
application of the open mapping theorem. Recall the Riemann-Lebesgue
lemma, which states

\[
\lim_{|n| \to \infty} |\hat{f}(n)| = 0,
\]

if $f \in L^1([-\pi, \pi])$, where $\hat{f}(n)$ denotes the $n$th Fourier coefficient of $f$.$^2$
A natural question that arises is the following: given any sequence of
complex numbers $\{a_n\}_{n \in \mathbb{Z}}$ that vanishes at infinity, that is, $|a_n| \to 0$ as
$|n| \to \infty$, does there exist $f \in L^1([-\pi, \pi])$ with $\hat{f}(n) = a_n$ for all $n$?

To reformulate this question in terms of Banach spaces, we let $\mathcal{B}_1 =
L^1([-\pi, \pi])$ equipped with the $L^1$-norm, and $\mathcal{B}_2$ denote the vector space
of all sequences $\{a_n\}$ of complex numbers with $|a_n| \to 0$ as $|n| \to \infty$. The
space $\mathcal{B}_2$ is equipped with the usual sup-norm $\|\{a_n\}\|_\infty = \sup_{n \in \mathbb{Z}} |a_n|$,
which clearly makes $\mathcal{B}_2$ into a Banach space.

Then, we ask whether the mapping $T : \mathcal{B}_1 \to \mathcal{B}_2$ defined by

\[
T(f) = \{\hat{f}(n)\}_{n \in \mathbb{Z}}
\]

is surjective.

The answer to this is negative.

Theorem 3.4 The mapping $T : \mathcal{B}_1 \to \mathcal{B}_2$ given by $T(f) = \{\hat{f}(n)\}$ is
linear, continuous and injective, but not surjective.

Therefore, there are sequences of complex numbers that vanish at
infinity and that are not the Fourier coefficients of $L^1$-functions.


Proof. We first note that $T$ is clearly linear, and also continuous
with $\|T(f)\|_\infty \le \|f\|_{L^1}$. Moreover, $T$ is injective since $T(f) = 0$ implies
that $\hat{f}(n) = 0$ for all $n$, which then implies$^3$ that $f = 0$ in $L^1$. If $T$ were
surjective, then Corollary 3.2 would imply that there is a constant $c > 0$
that satisfies

\[
c\|f\|_{L^1} \le \|T(f)\|_\infty, \quad \text{for all } f \in \mathcal{B}_1. \tag{9}
\]

However, if we set $f = D_N$, the $N$th Dirichlet kernel given by $D_N(x) =
\sum_{|n| \le N} e^{inx}$, and recall from Lemma 2.4 that $\|D_N\|_{L^1} = L_N \to \infty$ as
$N \to \infty$, while every Fourier coefficient of $D_N$ equals $0$ or $1$ so that
$\|T(D_N)\|_\infty = 1$, we find that (9) is violated as $N$ tends to infinity, which is our
desired contradiction.


2 See for instance Problem 1 in Chapter 2 of Book III. 

3 This result can be found in Theorem 3.1 in Chapter 4 of Book III.


4 The closed graph theorem

Suppose $X$ and $Y$ are two Banach spaces, with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$
respectively, and $T : X \to Y$ is a linear map. The graph of $T$ is defined
as a subset of $X \times Y$ by

\[
G_T = \{(x, y) \in X \times Y : y = T(x)\}.
\]

The linear map $T$ is closed if its graph is a closed subset in $X \times Y$. In
other words, $T$ is closed if whenever $\{x_n\} \subset X$ and $\{y_n\} \subset Y$ are two
converging sequences in $X$ and $Y$ respectively, say $x_n \to x$ and $y_n \to y$,
and if $T(x_n) = y_n$, then $T(x) = y$.

Theorem 4.1 Suppose $X$ and $Y$ are two Banach spaces. If $T : X \to Y$
is a closed linear map, then $T$ is continuous.

Proof. Since the graph of $T$ is a closed subspace of the Banach
space $X \times Y$ with the norm $\|(x, y)\|_{X \times Y} = \|x\|_X + \|y\|_Y$, the graph $G_T$
is itself a Banach space. Consider the two projections $P_X : G_T \to X$
and $P_Y : G_T \to Y$ defined by

\[
P_X(x, T(x)) = x \quad \text{and} \quad P_Y(x, T(x)) = T(x).
\]

The mappings $P_X$ and $P_Y$ are continuous and linear. Moreover, $P_X$ is
bijective, hence its inverse $P_X^{-1}$ is continuous by Corollary 3.2. Since
$T = P_Y \circ P_X^{-1}$, we conclude that $T$ is continuous, as was to be shown.


4.1 Grothendieck's theorem on closed subspaces of $L^p$

As an application of the closed graph theorem, we prove the following 
result: 

Theorem 4.2 Let $(X, \mu)$ be a finite measure space, that is, $\mu(X) < \infty$.
Suppose that:

(i) $E$ is a closed subspace of $L^p(X, \mu)$, for some $1 \le p < \infty$, and

(ii) $E$ is contained in $L^\infty(X, \mu)$.

Then $E$ is finite dimensional.

Since $E \subset L^\infty$, and $X$ has finite measure, we find that $E \subset L^2$ with

\[
\|f\|_{L^2} \le C\|f\|_{L^\infty} \quad \text{whenever } f \in E.
\]





The essential idea in the proof of the theorem is to reverse this inequality,
and then use the Hilbert space structure of $L^2$.

Equipped with the $L^p$-norm, $E$ is a Banach space since it is a closed
subspace of $L^p(X, \mu)$. Let

\[
I : E \to L^\infty
\]

denote the identity mapping $I(f) = f$. Then, $I$ is linear and closed.
Indeed, suppose that $f_n \to f$ in $E$ and $f_n \to g$ in $L^\infty$. Then, there
exists a subsequence of $\{f_n\}$ that converges almost everywhere to $f$ (see
Exercise 5 in Chapter 1), and therefore $f = g$ almost everywhere, as
desired. By the closed graph theorem there is an $M > 0$ so that

\[
\|f\|_{L^\infty} \le M\|f\|_{L^p} \quad \text{for all } f \in E. \tag{10}
\]

Lemma 4.3 Under the assumptions of the theorem, there exists $A > 0$
so that

\[
\|f\|_{L^\infty} \le A\|f\|_{L^2} \quad \text{for all } f \in E.
\]

Proof. If $1 \le p \le 2$, then Hölder's inequality with the conjugate
exponents $r = 2/p$ and $r^* = 2/(2 - p)$ yields

\[
\int |f|^p \le \left( \int |f|^2 \right)^{p/2} \left( \int 1 \right)^{(2-p)/2}.
\]

Since $X$ has finite measure, we see after taking $p$th roots in the above,
that there is some $B > 0$ so that $\|f\|_{L^p} \le B\|f\|_{L^2}$ for all $f \in E$. Together
with (10), this proves the lemma when $1 \le p \le 2$.

When $2 < p < \infty$, we note first that $|f(x)|^p \le \|f\|_{L^\infty}^{p-2} |f(x)|^2$, and
integrating this inequality gives

\[
\|f\|_{L^p}^p \le \|f\|_{L^\infty}^{p-2} \|f\|_{L^2}^2.
\]

If we now use (10), and assume that $\|f\|_{L^\infty} \ne 0$, we find that for some
$A > 0$, we have $\|f\|_{L^\infty} \le A\|f\|_{L^2}$ whenever $f \in E$, and the proof of the
lemma is complete.

We now return to the proof of Theorem 4.2. Suppose $f_1, \ldots, f_n$ is an
orthonormal set in $L^2$ of functions in $E$, and let $\mathbb{B}$ denote the unit ball
in $\mathbb{C}^n$,

\[
\mathbb{B} = \Big\{ \zeta = (\zeta_1, \ldots, \zeta_n) \in \mathbb{C}^n : \sum_{j=1}^{n} |\zeta_j|^2 \le 1 \Big\}.
\]

For each $\zeta \in \mathbb{B}$, let $f_\zeta(x) = \sum_{j=1}^{n} \zeta_j f_j(x)$. By construction we have
$\|f_\zeta\|_{L^2} \le 1$, and the lemma gives $\|f_\zeta\|_{L^\infty} \le A$. Hence for each $\zeta$, there
exists a measurable set $X_\zeta$ of full measure in $X$ (that is, $\mu(X_\zeta) = \mu(X)$),
so that

\[
|f_\zeta(x)| \le A \quad \text{for all } x \in X_\zeta. \tag{11}
\]

By first taking a countable dense subset of points in $\mathbb{B}$, and then using
the continuity of the mapping $\zeta \mapsto f_\zeta(x)$, we see that (11) implies

\[
|f_\zeta(x)| \le A \quad \text{for all } x \in X' \text{ and all } \zeta \in \mathbb{B}, \tag{12}
\]

where $X'$ is a set of full measure in $X$. From this, we claim that

\[
\sum_{j=1}^{n} |f_j(x)|^2 \le A^2 \quad \text{for all } x \in X'. \tag{13}
\]

Indeed, it suffices to establish this inequality when the left-hand side is
non-zero. Then, if we let $\sigma = \big(\sum_{j=1}^{n} |f_j(x)|^2\big)^{1/2}$, and set $\zeta_j = \overline{f_j(x)}/\sigma$,
then by (12) we find that for all $x \in X'$

\[
\frac{1}{\sigma} \sum_{j=1}^{n} |f_j(x)|^2 \le A,
\]

that is, $\sigma \le A$, as we claimed.

Finally, integrating (13), and recalling that $\{f_1, \ldots, f_n\}$ is orthonormal,
we find $n \le A^2 \mu(X)$, and therefore, the dimension of $E$ must be finite.

Remark. Problem 6 shows that the space $L^\infty$ in the theorem cannot
be replaced by any $L^q$ for $1 \le q < \infty$.


5 Besicovitch sets 


In Section 4.4, Chapter 7 of Book III, we constructed an example of a
Besicovitch set (or "Kakeya set") in $\mathbb{R}^2$, that is, a compact set with
two-dimensional Lebesgue measure zero that contains a unit line segment
in every direction. We recall that this set was obtained as a union of
finitely many rotations of a specific set: one that is given as a union of
line segments joining points from a Cantor-like set on the line $\{y = 0\}$ to
another Cantor-like set on the line $\{y = 1\}$. Our goal here is to present
an ingenious idea of Körner that proves the existence of Besicovitch sets
using the Baire category theorem; in fact, it is shown that in the right
metric space, such sets are generic.





The starting point of the analysis is an appropriate complete metric
space of sets in $\mathbb{R}^2$. Suppose $A$ is a subset of $\mathbb{R}^2$ and $\delta > 0$. We define
the $\delta$-neighborhood of $A$ by

\[
A^\delta = \{x : d(x, A) < \delta\}, \quad \text{where } d(x, A) = \inf_{y \in A} |x - y|.
\]

Then, if $A$ and $B$ are subsets of $\mathbb{R}^2$ we define the Hausdorff distance$^4$
between $A$ and $B$ by

\[
\operatorname{dist}(A, B) = \inf\{\delta : B \subset A^\delta \text{ and } A \subset B^\delta\}.
\]

We shall restrict our attention to compact subsets of $\mathbb{R}^2$. The Hausdorff
distance then satisfies the following properties.

Suppose $A$, $B$ and $C$ are non-empty compact subsets of $\mathbb{R}^2$:

(i) $\operatorname{dist}(A, B) = 0$ if and only if $A = B$.

(ii) $\operatorname{dist}(A, B) = \operatorname{dist}(B, A)$.

(iii) $\operatorname{dist}(A, C) \le \operatorname{dist}(A, B) + \operatorname{dist}(B, C)$.

(iv) The set of compact subsets of $\mathbb{R}^2$ equipped with the Hausdorff
distance is a complete metric space.

Verification of (i), (ii), and (iii) can be left to the reader, while the proof
of (iv), which is a little more intricate, is deferred to the end of this
section.
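For finite sets of points the Hausdorff distance can be computed directly, which makes properties (i) to (iii) easy to experiment with. The sketch below is an illustration only: for non-empty compact sets the infimum in the definition equals the larger of $\sup_{a \in A} d(a, B)$ and $\sup_{b \in B} d(b, A)$, and the code evaluates that expression for finite lists of points in the plane. The function names and the sample sets are hypothetical.

```python
import math

def dist_point_set(x, A):
    """d(x, A) = min over a in A of |x - a|, for a finite non-empty set A."""
    return min(math.dist(x, a) for a in A)

def hausdorff(A, B):
    """Hausdorff distance between two finite non-empty sets of points."""
    return max(max(dist_point_set(a, B) for a in A),
               max(dist_point_set(b, A) for b in B))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
C = [(0.0, 1.0)]

print(hausdorff(A, B))                    # B sticks out of A by the point (1, 2)
print(hausdorff(A, C), hausdorff(B, C))
print(hausdorff(A, B) <= hausdorff(A, C) + hausdorff(C, B))  # triangle inequality
```

Note that $A \subset B$ does not make the distance small: the distance is symmetric and also measures how far $B$ reaches beyond $A$.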

We now restrict our attention to the compact subsets of the square
$[-1/2, 1/2] \times [0, 1]$ which consist of a union of line segments joining points
from $L_0 = \{-1/2 \le x \le 1/2,\ y = 0\}$ to points on $L_1 = \{-1/2 \le x \le
1/2,\ y = 1\}$ and spanning all possible directions. More precisely, let $\mathcal{K}$
denote the set of closed subsets $K$ of the square $Q = [-1/2, 1/2] \times [0, 1]$
with the following properties:

(i) $K$ is a union of line segments $\ell$ joining a point of $L_0$ to a point
of $L_1$.

(ii) For every angle $\theta \in [-\pi/4, \pi/4]$ there exists a line segment $\ell$ in $K$
making an oriented angle of $\theta$ with the $y$-axis.

Simple limiting arguments then show that $\mathcal{K}$ is a closed subset of the
metric space of all compact subsets in $\mathbb{R}^2$ with the Hausdorff metric, and
consequently $\mathcal{K}$ with the Hausdorff distance is a complete metric space.

Our aim is to prove the following:

Theorem 5.1 The collection of sets in $\mathcal{K}$ of two-dimensional Lebesgue
measure zero is generic.

In particular, this collection is non-empty, and in fact dense.

Loosely stated, the key to the argument is to show that sets $K$ in $\mathcal{K}$
whose horizontal slices $\{x : (x, y) \in K\}$ have "small" Lebesgue measure
are generic. The argument is best carried out by using a "thickened"
version $K^\eta$ of $K$.

To this end, given $0 \le y_0 \le 1$ and $\epsilon > 0$ we define $\mathcal{K}(y_0, \epsilon)$ as the
collection of all compact subsets $K$ in $\mathcal{K}$ with the property that there exists $\eta >
0$ so that the $\eta$-neighborhood $K^\eta$ satisfies: for every $y \in [y_0 - \epsilon, y_0 + \epsilon]$
the horizontal slice $\{x : (x, y) \in K^\eta\}$ has one-dimensional Lebesgue
measure less than $10\epsilon$, that is,

\[
m_1(\{x : (x, y) \in K^\eta\}) < 10\epsilon, \quad \text{for all } y \in [y_0 - \epsilon, y_0 + \epsilon].^5 \tag{14}
\]

Lemma 5.2 For each fixed $y_0$ and $\epsilon$, the collection of sets $\mathcal{K}(y_0, \epsilon)$ is
open and dense in $\mathcal{K}$.

To prove that $\mathcal{K}(y_0, \epsilon)$ is open, suppose $K \in \mathcal{K}(y_0, \epsilon)$ and pick $\eta$ so that
$K^\eta$ satisfies the condition above. Suppose $K' \in \mathcal{K}$ with $\operatorname{dist}(K, K') <
\eta/2$. This means in particular that $K' \subset K^{\eta/2}$, and the triangle
inequality then shows that $(K')^{\eta/2} \subset K^\eta$. Therefore

\[
m_1(\{x : (x, y) \in (K')^{\eta/2}\}) \le m_1(\{x : (x, y) \in K^\eta\}) < 10\epsilon,
\]

and as a result $K' \in \mathcal{K}(y_0, \epsilon)$, as was to be shown.

To establish the rest of the lemma, we need to show that if $K \in \mathcal{K}$ and
$\delta > 0$, there exists $K' \in \mathcal{K}(y_0, \epsilon)$ so that $\operatorname{dist}(K, K') \le \delta$. The set $K'$ will
be given as the union of two sets $A$ and $A'$. The set $A$ will be constructed
by picking line segments $\ell$ in $K$ and looking at the corresponding angular
sector obtained by rotating the line segment $\ell$ by a small angle around
its intersection with $y = y_0$. This will result in two solid triangles with a
vertex on $y = y_0$, and we shall try to control the length of the intersection
of these triangles with any line segment parallel to the $x$-axis (Figure 2).

More precisely, if $N$ is a positive integer, we can consider the partition
of the interval $[-\pi/4, \pi/4]$ defined by

\[
\theta_n = -\frac{\pi}{4} + \frac{n}{N} \frac{\pi}{2} \quad \text{for } n = 0, \ldots, N - 1.
\]


5 The choice of 10 for the constant appearing in (14) is of no particular significance;
indeed, smaller constants would have done as well.


Figure 2. Rotation of $\ell(x, \theta)$ around its intersection with $y = y_0$ (the figure marks the horizontal lines $y = 0$, $y = y_0$ and $y = 1$)


Then the angles $\theta_n$ are uniformly spaced in $[-\pi/4, \pi/4]$ and the $N$
intervals defined by

\[
I_n = [\theta_n, \theta_n + \pi/(2N)],
\]

cover $[-\pi/4, \pi/4]$. Moreover each of these sub-intervals has length equal
to $\pi/(2N)$.

If we use $\ell(x, \theta)$ to denote the line segment joining $\{y = 0\}$ to $\{y = 1\}$
that passes through the point $(x, y_0)$ and which makes an oriented angle $\theta$
with the $y$-axis, then for each $\theta_n$ as defined above, by property (ii) of
the set $K$ there exists a number $-1/2 \le x_n \le 1/2$ so that $\ell(x_n, \theta_n) \subset K$.
For each $n = 0, \ldots, N - 1$ consider the compact set

\[
S_n = \bigcup_{\varphi \in I_n} \ell(x_n, \varphi).
\]

Each $S_n$ therefore consists of (at most) two closed triangles with vertex
at the point $(x_n, y_0)$. Now let

\[
A = \bigcup_{n=0}^{N-1} S_n.
\]

If $N \ge c/\delta$ (for a large enough constant $c$), then the sets $S_n$ that are
not entirely contained in the square $Q$ can be translated slightly to the
left or right so that the resulting set $A$ belongs to $Q$, and moreover so
that every point in $A$ is at a distance less than $\delta$ from a point in $K$; that
is, $A \subset K^\delta$.

However it is not necessarily true that every point of $K$ is close to $A$,
since in defining $A$ we have dealt only with some of the lines $\ell(x_n, \theta_n)$ that
make up $K$. To remedy this we add a finite set of lines to obtain a set $A'$
that is close to $K$ in the Hausdorff metric. In more detail, recall that $K$
is itself a union of lines, $K = \bigcup \ell$, and let $\ell^\delta$ be the $\delta$-neighborhood of $\ell$.
Then $\bigcup \ell^\delta$ is an open cover of $K$ and thus we can select a finite subcover
$\bigcup_{m=1}^{M} \ell_m^\delta$ of $K$. We define $A' = \bigcup_{m=1}^{M} \ell_m$ and set

\[
K' = A \cup A'.
\]

Observe first that $K' \in \mathcal{K}$. Note next that by its definition, $A' \subset K$, but
$(A')^\delta \supset K$. Therefore $(K')^\delta \supset K$. Also $K^\delta \supset K'$, since $K^\delta \supset A$ as we
have seen, and $K^\delta \supset K \supset A'$. This shows that $\operatorname{dist}(K', K) \le \delta$.

We next estimate $m_1(\{x : (x, y) \in (K')^\eta\})$ for $y_0 - \epsilon \le y \le y_0 + \epsilon$ by
adding the corresponding estimates with $K'$ replaced by $A$ and $A'$. Note
that for fixed $y$ the set $\{x : (x, y) \in A\}$ consists of $N$ intervals arising
from the intersection of the horizontal line at height $y$ with the $N$
triangles that have their vertices at height $y_0$. By a simple trigonometric
argument, since $|y - y_0| \le \epsilon$ and the magnitudes of the angles at the vertices
are $\pi/(2N)$, each corresponding interval of $A^\eta$ has length $\le 8\epsilon/N + 2\eta$.
Thus

\[
m_1(\{x : (x, y) \in A^\eta\}) \le 8\epsilon + 2\eta N.
\]

Next $A'$ consists of $M$ line segments, so the set $\{x : (x, y) \in A'\}$
consists of $M$ points, and therefore the set $\{x : (x, y) \in (A')^\eta\}$ is the union
of $M$ intervals of length $2\eta$; this has measure $\le 2\eta M$. Altogether then
$m_1(\{x : (x, y) \in (K')^\eta\}) \le 8\epsilon + 2\eta(M + N)$ and we get estimate (14)
for $K'$ if we take $\eta < \epsilon/(M + N)$. This completes the proof of the lemma.
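The "simple trigonometric argument" for the slices of the triangles can be sketched as follows, under an assumed parametrisation that is not spelled out in the text: a line through the vertex $(x_n, y_0)$ making angle $\varphi$ with the $y$-axis meets height $y$ at $x = x_n + (y - y_0)\tan\varphi$. The slice of the sector with angles in $[\theta_n, \theta_n + \pi/(2N)]$ then has length $|y - y_0|\,(\tan(\theta_n + \pi/(2N)) - \tan\theta_n)$, and since the derivative of $\tan$ is at most $2$ on $[-\pi/4, \pi/4]$ this is at most $\pi\epsilon/N$, comfortably below the bound $8\epsilon/N$ used above. The code checks this numerically; the function names are hypothetical.

```python
import math

def slice_length(theta, width, t):
    """Length of the horizontal slice, at signed height t = y - y0 from the vertex,
    of the sector swept by lines whose angle with the y-axis lies in
    [theta, theta + width]."""
    return abs(t) * (math.tan(theta + width) - math.tan(theta))

def worst_slice(N, eps):
    """Largest slice length over the N sectors, at distance eps from the vertex height."""
    width = math.pi / (2 * N)
    thetas = [-math.pi / 4 + n * width for n in range(N)]
    return max(slice_length(th, width, eps) for th in thetas)

eps = 0.1
for N in (1, 4, 50):
    print(N, worst_slice(N, eps), math.pi * eps / N, 8 * eps / N)
```

The thickening by $\eta$ is not modelled here; it only adds the term $2\eta$ per interval that appears in the estimate.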

We can now proceed with the final argument in the proof of the
theorem. For each $M$, consider the set

\[
\mathcal{K}_M = \bigcap_{m=1}^{M} \mathcal{K}(m/M, 1/M).
\]

Each $\mathcal{K}_M$ is open and dense, and moreover if $K \in \mathcal{K}_M$, each slice of $K$
along any $0 \le y \le 1$ has one-dimensional Lebesgue measure that is
$O(1/M)$. Since open dense sets are generic, and the countable
intersection of generic sets is generic, the set

\[
\mathcal{K}^* = \bigcap_{M=1}^{\infty} \mathcal{K}_M
\]

is generic in $\mathcal{K}$, and by the above observation if $K \in \mathcal{K}^*$ then each slice
$K^y = \{x : (x, y) \in K\}$ ($0 \le y \le 1$) has Lebesgue measure 0, hence
Fubini's theorem implies that $K$ has two-dimensional Lebesgue measure
equal to 0. This completes the proof of Theorem 5.1.

We conclude this section with the proof of property (iv) of the
Hausdorff distance, the completeness of the metric.

Suppose $\{A_n\}$ is a sequence of (non-empty) compact subsets that is
Cauchy with respect to the Hausdorff distance; let $\mathcal{A}_n = \overline{\bigcup_{k=n}^{\infty} A_k}$ and
$A = \bigcap_{n=1}^{\infty} \mathcal{A}_n$. We claim that $A$ is non-empty, compact, and $A_n \to A$.

Given $\epsilon > 0$ there exists $N_1$ so that $\operatorname{dist}(A_n, A_m) < \epsilon$ for all $n, m \ge N_1$.
As a result, it is clear that whenever $n \ge N_1$, then $\bigcup_{k=n}^{\infty} A_k \subset (A_n)^\epsilon$,
hence $\mathcal{A}_n \subset (A_n)^{2\epsilon}$. This implies

\[
A \subset (A_n)^{2\epsilon} \quad \text{whenever } n \ge N_1. \tag{15}
\]

Since each $\mathcal{A}_n$ is non-empty and compact, and since $\mathcal{A}_{n+1} \subset \mathcal{A}_n$, it
follows that $A$ is non-empty and compact, and moreover $\operatorname{dist}(\mathcal{A}_n, A) \to 0$.
Indeed, if $\operatorname{dist}(\mathcal{A}_n, A)$ did not converge to zero, then there would
exist $\epsilon_0 > 0$, an increasing sequence $n_k$ of positive integers, and points
$x_{n_k} \in \mathcal{A}_{n_k}$ so that $d(x_{n_k}, A) \ge \epsilon_0$. Since $\{x_{n_k}\} \subset \mathcal{A}_1$, which is compact,
we may assume (after picking a subsequence and relabeling if
necessary) that $\{x_{n_k}\}$ converges to a limit, say $x$, which would clearly satisfy
$d(x, A) \ge \epsilon_0$. But for every $M$, we have $x_{n_k} \in \mathcal{A}_M$ for all sufficiently
large $n_k$, and since $\mathcal{A}_M$ is compact, we must have $x \in \mathcal{A}_M$, thus $x \in A$.
This contradicts the fact that $d(x, A) \ge \epsilon_0$, hence $\operatorname{dist}(\mathcal{A}_n, A) \to 0$.

Returning to our proof of (iv), pick $N_2$ so that $\operatorname{dist}(\mathcal{A}_n, A) < \epsilon$ for all
$n \ge N_2$. This implies that $\mathcal{A}_n \subset A^{2\epsilon}$ for $n \ge N_2$, therefore

\[
A_n \subset A^{2\epsilon} \quad \text{whenever } n \ge N_2. \tag{16}
\]

Combining (15) and (16) yields the inequality $\operatorname{dist}(A_n, A) \le 2\epsilon$ whenever
$n \ge \max(N_1, N_2)$, which implies $A_n \to A$, and that concludes the proof.


6 Exercises 

1. Below are some examples of generic sets and sets of the first category. 

(a) Let $\{x_j\}_{j=1}^{\infty}$ denote an enumeration of the rational numbers in $\mathbb{R}$, and
consider the sets

\[
U_n = \bigcup_{j=1}^{\infty} \Big( x_j - \frac{1}{n 2^j},\ x_j + \frac{1}{n 2^j} \Big), \quad \text{and} \quad U = \bigcap_{n=1}^{\infty} U_n.
\]

Show that $U$ is generic but has Lebesgue measure zero.

(b) Use a Cantor-like set (as described, for example, in Exercise 4, Chapter 1
of Book III) to give an example of a subset of the first category that has
full Lebesgue measure in $[0, 1]$. Note that automatically this subset will be
uncountable and dense. Also, its complement is generic and has measure
zero, giving an alternative to the set $U$ in (a).

2. Suppose $F$ is a closed subset and $\mathcal{O}$ an open subset of a complete metric space $X$.

(a) Show that $F$ is of the first category if and only if $F$ has empty interior.

(b) Show that $\mathcal{O}$ is of the first category if and only if $\mathcal{O}$ is empty.

(c) Consequently, prove that $F$ is generic if and only if $F = X$; and $\mathcal{O}$ is generic
if and only if $\mathcal{O}^c$ contains no interior.

[Hint: For (a), argue by contradiction, assuming that a closed ball $B$ is contained
in $F$. Apply the category theorem to the complete metric space $B$.]

3. Show that the conclusion of the Baire category theorem continues to hold if $X_0$
is a metric space that arises as an open subset of a complete metric space $X$.

[Hint: Apply the Baire category theorem to the closure of $X_0$ in $X$.]

4. Prove that every continuous function on [0,1] can be approximated uniformly 
by continuous nowhere differentiable functions. Do so by either: 

(a) using Theorem 1.5. 

(b) using only the fact that a continuous nowhere differentiable function exists. 

5. Let $X$ be a complete metric space. We recall that a set is a $G_\delta$ in $X$ if it is a
countable intersection of open sets. Also, a set is an $F_\sigma$ in $X$ if it is a countable
union of closed sets.

(a) Show that a dense $G_\delta$ is generic.

(b) Hence a countable dense set is an $F_\sigma$, but not a $G_\delta$.

(c) Prove the following partial converse to (a). If $E$ is a generic set, then there
exists $E_0 \subset E$ with $E_0$ a dense $G_\delta$.


6. The function

\[
f(x) = \begin{cases} 0 & \text{if } x \text{ is irrational} \\ 1/q & \text{if } x = p/q \text{ is rational and expressed in lowest form} \end{cases}
\]

is continuous precisely at the irrationals. In contrast to this, prove that there is
no function on $\mathbb{R}$ that is continuous precisely at the rationals.

[Hint: Show that the set of points where a function is continuous is a $G_\delta$ (see the
proof of Theorem 1.3), and apply Exercise 5.]

7. Let $E$ be a subset of $[0, 1]$, and let $I$ be any closed non-trivial interval in $[0, 1]$.

(a) Suppose $E$ is of the first category in $[0, 1]$. Show that for every $I$, the set
$E \cap I$ is of the first category in $I$.

(b) Suppose $E$ is generic in $[0, 1]$. Show that for every $I$, the set $E \cap I$ is generic
in $I$.

(c) Construct a set $E$ in $[0, 1]$ so that for all $I$, the set $E \cap I$ is neither of the
first category nor generic in $I$.

[Hint: Consider the Cantor set in [0,1]; then in each open interval of its complement 
place a scaled copy of the Cantor set; continue this process indefinitely. For a 
related measure theoretic result, see Exercise 36 in Chapter 1 of Book III.] 

8. A Hamel basis for a vector space $X$ is a collection $\mathcal{H}$ of vectors in $X$, such
that any $x \in X$ can be written as a unique finite linear combination of elements
in $\mathcal{H}$.

Prove that a Banach space cannot have a countable Hamel basis. 

[Hint: Show that otherwise the Banach space would be of the first category in 
itself.] 

9. Consider $L^p([0, 1])$ with Lebesgue measure. Note that if $f \in L^p$ with $p > 1$,
then $f \in L^1$. Show that the set of $f \in L^1$ so that $f \notin L^p$ is generic.

A more general result can be found in Problem 1.

[Hint: Consider the set $E_N = \{f \in L^1 : \int_I |f| \le N m(I)^{1 - 1/p} \text{ for all intervals } I\}$.
Note that each $E_N$ is closed and that $L^p \subset \bigcup_N E_N$. Finally, show that $E_N$ is
nowhere dense by considering $f_0 + \epsilon g$ where $g(x) = x^{-(1-\delta)}$ with $0 < \delta < 1 - 1/p$.]

10. Consider $\Lambda^\alpha(\mathbb{R})$, with $0 < \alpha < 1$. Show that the set of nowhere differentiable
functions is a generic set in $\Lambda^\alpha(\mathbb{R})$.

Note however that functions corresponding to the case $\alpha = 1$, that is, Lipschitz
functions, are almost everywhere differentiable. (See Exercise 32 in Chapter 3 of
Book III.)

11. Consider the Banach space $X = C([0, 1])$ over the reals, with the sup-norm
on $X$. Let $\mathcal{M}$ be the collection of functions that are not monotonic (increasing
or decreasing) in any interval $[a, b]$, where $0 \le a < b \le 1$. Prove that $\mathcal{M}$ is generic
in $X$.

[Hint: Let $\mathcal{M}_{[a,b]}$ denote the subset of $X$ consisting of functions that are not
monotonic in $[a, b]$. Then $\mathcal{M}_{[a,b]}$ is dense in $X$, while $\mathcal{M}_{[a,b]}^c$ is closed.]

12. Suppose $X$, $Y$ and $Z$ are Banach spaces, and $T : X \times Y \to Z$ is a mapping
such that:

(i) For each $x \in X$, the mapping $y \mapsto T(x, y)$ is linear and continuous on $Y$.

(ii) For each $y \in Y$, the mapping $x \mapsto T(x, y)$ is linear and continuous on $X$.

Prove that $T$ is (jointly) continuous on $X \times Y$, and in fact,

\[
\|T(x, y)\|_Z \le C\|x\|_X \|y\|_Y
\]

for some $C > 0$ and all $x \in X$ and $y \in Y$.


13. Let $(X, \mathcal{F}, \mu)$ be a measure space, and let $\{f_n\}$ be a sequence of functions
in $L^p(X, \mu)$. We know from Exercise 12 in Chapter 1, that if $1 < p < \infty$, and
$\sup_n \|f_n\|_{L^p} < \infty$, then some subsequence of $\{f_n\}$ converges weakly in $L^p$. In
other words, there exist a subsequence $\{f_{n_k}\}$ of $\{f_n\}$, and an $f \in L^p$, so that if $q$
denotes the conjugate exponent of $p$, that is $1/p + 1/q = 1$, then

\[
\int_X f_{n_k}(x) g(x)\, d\mu(x) \to \int_X f(x) g(x)\, d\mu(x) \quad \text{for every } g \in L^q.
\]

More generally, we say that a sequence $\{f_n\}$ in $L^p$ is weakly bounded if

\[
\sup_n \left| \int_X f_n(x) g(x)\, d\mu(x) \right| < \infty \quad \text{for all } g \in L^q.
\]

Prove that if $1 < p < \infty$, and $\{f_n\}$ is a sequence of functions in $L^p$ that is weakly
bounded, then

\[
\sup_n \|f_n\|_{L^p} < \infty.
\]

In particular this holds if $\{f_n\}$ converges weakly in $L^p$.

[Hint: Apply the uniform boundedness principle to $\ell_n(g) = \int_X f_n(x) g(x)\, d\mu(x)$.]

14. Suppose $X$ is a complete metric space with respect to a metric $d$, and $T :
X \to X$ a continuous function. An element $x^*$ in $X$ is universal for $T$ if the orbit
set $\{T^n(x^*)\}_{n=1}^{\infty}$ is dense in $X$. Here $T^n = T \circ T \circ \cdots \circ T$ denotes $n$ compositions
of $T$.

Show that the set of universal elements for $T$ in $X$ is either empty or generic.

[Hint: Suppose $x^*$ is universal for $T$, let $\{x_j\}$ be a dense set of elements in $X$,
and let $F_{j,k,N} = \{x \in X : d(T^n x, x_j) < 1/k \text{ for some } n > N\}$. Show that $F_{j,k,N}$
is open and dense.]

15. Let $B$ denote the closure of the unit ball in $\mathbb{R}^d$, and consider the metric space $\mathcal{C}$
of compact subsets of $B$ with the Hausdorff distance. (See Section 5.) Show that
the following two collections are generic.

(a) The subsets of Lebesgue measure zero.

(b) The subsets that are nowhere dense.

[Hint: For (a) show that the collection of sets $C$ so that $m(C) < 1/n$ is open and
dense. In fact for such a set, $C^c \supset \bigcup_j Q_j$, where $Q_j$ are disjoint open cubes so
that $\sum_j |Q_j| > 1 - 1/n$. Now shrink the $Q_j$. For (b) fix an open set $\mathcal{O}$ and show
that the collection $\mathcal{C}_{\mathcal{O}}$ of sets in $\mathcal{C}$ that contain $\mathcal{O}$ is closed and nowhere dense.]


7 Problems 

1. Let $T : \mathcal{B}_1 \to \mathcal{B}_2$ be a bounded linear transformation of a Banach space $\mathcal{B}_1$ to a
Banach space $\mathcal{B}_2$.

(a) Prove that either $T$ is surjective, or the image $T(\mathcal{B}_1)$ is of the first category
in $\mathcal{B}_2$.

(b) As a consequence, prove the following: Suppose $(X, \mu)$ is a finite measure
space, and $1 \le p_1 < p_2 \le \infty$. One has of course $L^{p_2}(X) \subset L^{p_1}(X)$. Show
that $L^{p_2}(X)$ is a set of the first category in $L^{p_1}(X)$ (except in the trivial
case for which each element of $L^{p_1}$ belongs to $L^{p_2}$).

[Hint: For (a), assume that $T(\mathcal{B}_1)$ is of the second category and use an argument
similar to the proof of Theorem 3.1 to show that the image under $T$ of a ball
centered at the origin of $\mathcal{B}_1$ contains a ball centered at the origin in $\mathcal{B}_2$.]

2. For each integer $n \ge 2$, let $A_n$ denote the set of real numbers $x$ so that there
exist infinitely many distinct fractions $p/q$ so that

\[
|x - p/q| \le 1/q^n.
\]

Show that:

(a) $A_n$ is a generic set in $\mathbb{R}$.

(b) However, the Hausdorff dimension of $A_n$ equals $2/n$.

(c) Hence $m(A_n) = 0$, if $n > 2$, where $m$ denotes the Lebesgue measure.

The elements of $A = \bigcap_{n \ge 2} A_n$ are called the Liouville numbers. While it is not
difficult to see that every element of $A$ is transcendental, it is a deeper fact that
the same holds for each element of $A_n$ when $n > 2$. (Note that in the case $n = 2$,
the set $A_2$ consists of the irrationals.)

3. Consider the Banach space $\mathcal{B}$ of continuous functions on the circle (with the
sup-norm). Prove that the set of $f$ in $\mathcal{B}$ whose Fourier series diverges in a generic
set on the circle, is itself a generic set in $\mathcal{B}$.

[Hint: Choose $\{x_i\}$ dense in $[0, 1]$, let $E_i = \{f \in \mathcal{B} : \sup_N |S_N(f)(x_i)| = \infty\}$, and
$E = \bigcap E_i$. Then $E$ is generic. For each $f \in E$, define $\mathcal{O}_n = \{x : |S_N(f)(x)| >
n \text{ for some } N\}$. Show that $\bigcap \mathcal{O}_n$ is generic.]

4. Let $D$ denote the open unit disc in the complex plane, and let $A$ be the Banach
space of all continuous complex-valued functions on $\overline{D}$ that are holomorphic on $D$,
equipped with the sup-norm. Then, the space of functions in $A$ which cannot be
extended analytically past any point of the boundary of $D$ is generic. To prove
this statement establish the following:

(a) The set $A_N = \{f \in A : |f(e^{i\theta}) - f(1)| \le N|\theta|\}$ is closed.

(b) $A_N$ is nowhere dense.

[Hint: For (b) use the function $f_0(z) = (1 - z)^{1/2}$ and consider $f + \epsilon f_0$.]


5. Let $I = [0, 1]$ denote the unit interval, and $C^\infty(I)$ the vector space of all smooth
functions on $I$ equipped with the metric $d$ given by

\[
d(f, g) = \sum_{n=0}^{\infty} \frac{1}{2^n} \frac{\rho_n(f - g)}{1 + \rho_n(f - g)},
\]

where $\rho_n(h) = \sup_{x \in I} |h^{(n)}(x)|$. A function $f \in C^\infty(I)$ is analytic at a point $x_0 \in
I$, if its Taylor series

\[
\sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n
\]

converges in a neighborhood of $x_0$ to the function $f$. The function $f$ is said to be
singular at $x_0$ if its Taylor series diverges at $x_0$.

(a) Show that $(C^\infty(I), d)$ is a complete metric space.

(b) Prove that the set of functions in $C^\infty(I)$ that are singular at every point is
generic.

[Hint: For (b), consider the set $F_K$ of smooth functions $f$ that satisfy $|f^{(n)}(x^*)|/n! \le
K^n$ for some $x^*$ and all $n$, and show that $F_K$ is closed and nowhere dense.]


6. The space $L^\infty$ in Theorem 4.2 cannot be replaced by any $L^q$, with $1 \le q < \infty$.
In fact there exists a closed infinite dimensional subspace of $L^1([0,1])$ consisting
of functions that belong to $L^q$ for all $1 \le q < \infty$.

[Hint: One may use Exercise 19 in the next chapter.]


7.* As an application of Exercise 14, let $\mathcal{H}$ denote the vector space of entire
functions, that is, the set of functions that are holomorphic in all of $\mathbb{C}$. Given a
compact subset $K$ of the complex plane and $f \in \mathcal{H}$, let $\|f\|_K = \sup_{z \in K} |f(z)|$. If $K_n$
denotes the closed disc centered at the origin and of radius $n$, define

$$d(f, g) = \sum_{n=1}^{\infty} \frac{1}{2^n} \frac{\|f - g\|_{K_n}}{1 + \|f - g\|_{K_n}} \quad \text{whenever } f, g \in \mathcal{H}.$$

Then $d$ is a metric, and $\mathcal{H}$ is a complete metric space with respect to $d$. Also,
$d(f_n, f) \to 0$ if and only if $f_n$ converges to $f$ uniformly on every compact subset
of $\mathbb{C}$.





Birkhoff’s theorem (Problem 5, Chapter 2, Book II) states that there exists an
entire function $F$ so that the set $\{F(z + N)\}_{N=1}^{\infty}$ is dense in $\mathcal{H}$. Also, MacLane’s
theorem (see the end of the same problem in Book II) says that there is an entire
function $G$ so that the set of its derivatives $\{G^{(n)}(z)\}_{n=1}^{\infty}$ is dense in $\mathcal{H}$.

By Exercise 14, the set of functions in $\mathcal{H}$ with either of these properties is generic
in $\mathcal{H}$, hence the set of entire functions with both properties is also generic.



Rudiments of Probability 
Theory 


The whole of my work in probability theory together
with Khinchin, in general the whole first period of my
work in this theory was marked by the fact that we
employed methods worked out in the metric theory of
functions. Such topics as conditions for the applicability
of the law of large numbers or a condition for
convergence of a series of independent random variables
essentially involved methods forged in the general
theory of trigonometric series...

A. N. Kolmogorov, ca. 1987


One owes to Steinhaus the definition of independent
functions, whether there are finitely or infinitely many.
It follows from this definition, first published here,
that certain systems of orthogonal functions... (including)
those of Rademacher, consist of independent
functions.

M. Kac, 1936


The simplest way to introduce the basic concepts of probability theory
is to begin by considering Bernoulli trials (for example, coin flips) and
inquire as to what happens in the limit as the number of trials tends to
infinity. Essential here is the idea of independent events that is subsumed
in the more elaborate notion of mutually independent random variables.$^1$

$^1$ We prefer to use the terminology “function” instead of “random variable” in much of
what follows.

The case of Bernoulli trials where each flip has probability 1/2 can
be translated as the study of the Rademacher functions. As we will
see, the properties of these mutually independent functions lead to some
remarkable consequences for random series. In particular, when a formal
Fourier series is randomized by the Rademacher functions there is then
the following striking instance of the “zero-one law”: either almost every
resulting series corresponds to an $L^p$ function for every $p < \infty$, or almost
none is the Fourier series of an $L^1$ function.

From this special set of independent functions we turn to the aspects
of the general theory, and our focus is on the behavior of sums of more
general independent functions. In the first instance, when these functions
are identically distributed (and square integrable) we obtain the “central
limit theorem” in this more extended setting. We also see that there is a
close link with the ergodic theorem, and this allows us to prove one form
of the “law of large numbers.”

Next we consider independent functions that are not necessarily identically
distributed. Here the main property that is exploited is that the
corresponding sums form a “martingale sequence.” In fact, an interesting
case of this was seen in the analysis of sums involving Rademacher functions.
Of importance at this point is the maximal theorem for martingale
sequences, akin to the maximal theorem in Chapter 2.

We conclude this chapter by returning to Bernoulli trials, now interpreted
as a random walk on the line. It is natural to consider the analogous
random walks in $d$ dimensions. For these we find some striking
differences between the cases $d \le 2$ and $d \ge 3$, in terms of their recurrence
properties.


1 Bernoulli trials 

An examination of some questions related to coin flips gives the easiest
examples of some of the concepts of probability theory.

1.1 Coin flips

We begin by considering the simplest gambling game. Two players, A
and B, decide to flip a fair coin $N$ times. Each time the coin comes up
“heads” player A wins one dollar; each time the coin comes up “tails”
player A loses a dollar. Since each flip has two possible outcomes, there
are $2^N$ possible sequences of outcomes for their game. If we take into
account the resulting possibilities, a question that arises is: what are
(say) player A’s chances of winning, and in particular, his chances of
winning $k$ dollars, for some $k$?

To answer this question we first formalize the above situation and
introduce some terminology whose more general usage will occur later.
The $2^N$ possible scenarios (or “outcomes”) under consideration can be
thought of as points in the $N$-fold product of the two-point space
$\mathbb{Z}_2 = \{0,1\}$, with 0 standing for heads and 1 for tails. That is,

$$\mathbb{Z}_2^N = \{x = (x_1, \ldots, x_N), \text{ with } x_j = 0 \text{ or } 1 \text{ for each } 1 \le j \le N\}.$$

If we assume that flipping heads or tails, at the $n$th flip, is equally probable
(and hence each has probability 1/2) for every $n$, we are then quickly
led to the following definitions: The space $\mathbb{Z}_2^N$ is our underlying “probability
space”; on it there is a measure $m$, the “probability measure”
which assigns measure $2^{-N}$ to each point of $\mathbb{Z}_2^N$, and $m(\mathbb{Z}_2^N) = 1$. We
note that if $E_n$ denotes the collection of events for which the $n$th flip is
heads, $E_n = \{x \in \mathbb{Z}_2^N : x_n = 0\}$, then $m(E_n) = 1/2$ for all $1 \le n \le N$;
also $m(E_n \cap E_m) = m(E_n) m(E_m)$, for all $n, m$ with $n \ne m$. The latter
identity reflects the fact that the outcomes of the $n$th and $m$th flips are
“independent.”


We also need to consider certain functions on our probability space. (In
the parlance of probability theory, functions on probability spaces are often
referred to as random variables; we prefer to retain the designation
“functions.”) We define the function $r_n$ to be the amount player A wins
(or loses) at the $n$th flip, that is, $r_n(x) = 1$ if $x_n = 0$, and $r_n(x) = -1$ if
$x_n = 1$, where $x = (x_1, \ldots, x_N)$. The sum

$$S_N(x) = S(x) = \sum_{n=1}^{N} r_n(x)$$

gives the total winnings (or losses) of player A after $N$ flips.


Next, let us get an idea of what is the probability that $S(x) = k$, for
a given integer $k$. If a given point $x \in \mathbb{Z}_2^N$ has $N_1$ zeroes and $N_2$ ones
among its coordinates (that is, player A has $N_1$ wins and $N_2$ losses),
then of course $S(x) = k$ means $k = N_1 - N_2$, while $N_1 + N_2 = N$. Thus

$$N_1 = (N + k)/2 \quad \text{and} \quad N_2 = (N - k)/2,$$

and $k$ has the same parity as $N$. To proceed further, we assume that $N$
is even; the case of $N$ odd is similar. (See Exercise 1.)

Thus in our probability space one has as many points $x$ for which
$S(x) = k$ as ways one can choose $N_1$ zeroes when making $N$ choices
among either 0 or 1. This number is the binomial coefficient

$$\binom{N}{N_1} = \frac{N!}{N_1! (N - N_1)!} = \frac{N!}{\left(\frac{N+k}{2}\right)! \left(\frac{N-k}{2}\right)!}.$$

As a result, since each point carries measure $2^{-N}$, we have that

(1) $$m(\{x : S(x) = k\}) = 2^{-N} \frac{N!}{\left(\frac{N+k}{2}\right)! \left(\frac{N-k}{2}\right)!}.$$

What can we say about the relative size of these numbers as $k$ varies
from $-N$ to $N$ (with $k$ even)? The smallest values of (1) are attained
at the end-points, $k = -N$ or $k = N$, with $m(\{x : S(x) = N\}) = m(\{x :
S(x) = -N\}) = 2^{-N}$. As $k$ varies from $-N$ to 0 (with $k$ even), $m(\{x :
S(x) = k\})$ increases, and then decreases as $k$ increases from 0 to $N$.
This is because

$$\frac{m(\{x : S(x) = k + 2\})}{m(\{x : S(x) = k\})} = \frac{N - k}{N + k + 2},$$

and the right-hand side is greater than 1 or less than 1 according to
whether $k \le -2$ or $k \ge 0$, respectively. Thus clearly (1) attains its maximum
value at $k = 0$, and this is

$$2^{-N} \frac{N!}{((N/2)!)^2}.$$

By Stirling’s formula (more about this below), this quantity is approximately
$\frac{2}{\sqrt{2\pi}} N^{-1/2}$, which is much larger than the minimum value $2^{-N}$.
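
The size comparisons above can be checked numerically. The following is a minimal Python sketch, not taken from the text: the function name `prob_sum_equals` and the choice $N = 100$ are illustrative assumptions. It tabulates the probabilities in (1), locates the maximum, and compares the value at $k = 0$ with the Stirling approximation $2/\sqrt{2\pi N}$.

```python
from math import comb, pi, sqrt


def prob_sum_equals(N: int, k: int) -> float:
    """Return m({x : S(x) = k}) for N fair flips, using formula (1)."""
    if abs(k) > N or (N + k) % 2 != 0:
        return 0.0  # k must satisfy |k| <= N and have the same parity as N
    return comb(N, (N + k) // 2) / 2**N


N = 100  # illustrative choice (assumption); any even N works
probs = {k: prob_sum_equals(N, k) for k in range(-N, N + 1, 2)}
peak = max(probs, key=probs.get)   # the k with the largest probability
stirling = 2 / sqrt(2 * pi * N)    # Stirling approximation of the value at k = 0

print(peak, probs[0], stirling, probs[N])
```

For moderate $N$ the exact value at $k = 0$ and the Stirling approximation already agree to within about one percent, while the end-point value $2^{-N}$ is smaller by many orders of magnitude.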


With this, we leave these elementary considerations and begin to deal
with the questions of probability theory that arise when we pass to the
limiting situation $N = \infty$.


1.2 The case $N = \infty$

Here we take our probability space to be the infinite product of copies
of $\mathbb{Z}_2$, which is written as $\mathbb{Z}_2^\infty$, and which we denote more simply as $X$.
That is,

$$X = \{x = (x_1, \ldots, x_n, \ldots), \text{ each } x_n = 0 \text{ or } 1 \text{ for all } n \ge 1\}.$$

The space $X$ inherits the natural product measure from each of the
measures of the partial products $\mathbb{Z}_2^N$ (in turn from each of the factors
$\mathbb{Z}_2$) above as follows. A set $E$ is a cylinder set in $X$ whenever there is a
(finite) $N$ and a set $E' \subset \mathbb{Z}_2^N$ so that $x \in E$ if and only if $(x_1, \ldots, x_N) \in
E'$. With this definition the collection of cylinder sets together with their
finite unions and intersections, and complements, forms an algebra on $X$.
The main point now is that the function $m$ defined first on these sets
by $m(E) = m_N(E')$ (where $m_N = m$ is the measure on $\mathbb{Z}_2^N$ described
in the previous section) extends to a measure on the $\sigma$-algebra of sets
generated by the cylinder sets. Clearly $m(X) = 1$. (In this connection,
the reader may consult Exercises 14 and 15 in Chapter 6 of Book III.)

More generally consider a pair $(X, m)$, where we are given a $\sigma$-algebra
of subsets of $X$ (the “measurable” sets, or “events”) and a measure $m$
on this $\sigma$-algebra, with $m(X) = 1$. Adopting the terminology used previously,
we refer to $X$ as a probability space and $m$ as a probability
measure. In this context, one uses the terminology “almost surely” to
mean “almost everywhere.”

Returning to the case $X = \mathbb{Z}_2^\infty$ with the product measure defined
above, we can extend to it the functions $r_n$, for all $1 \le n < \infty$. This
means that we take $r_n(x) = 1 - 2x_n$, where $x = (x_1, \ldots, x_n, \ldots)$ and
$x_n = 0$ or 1, for each $n$. These functions may also be viewed as setting
up a correspondence between $X$ and the interval $[0,1]$, with the
measure $m$ then identified with Lebesgue measure on this interval. In
fact, consider the mapping $D : X \to [0,1]$ given by

(2) $$D : (x_1, \ldots, x_n, \ldots) \mapsto \sum_{j=1}^{\infty} \frac{x_j}{2^j} \in [0,1].$$

The correspondence $D$ becomes a bijection from $X$ to $[0,1]$ if we remove
the denumerable sets $Z_1$ and $Z_2$ respectively from $X$ and $[0,1]$, with $Z_1$
consisting of all points in $X$ whose coordinates are all 0 or all 1 after a
finite number of places; and $Z_2$ consists of all dyadic rationals (points
in $[0,1]$ of the form $\ell/2^m$, with $\ell$ and $m$ integers). Moreover, note that
if $E \subset X$ is the cylinder set $E = \{x : x_j = a_j, \ 1 \le j \le N\}$ where the
$a_j$ are a given finite set of 0’s and 1’s, then $m(E) = 2^{-N}$. Moreover,
$D$ maps $E$ to the dyadic interval $\left[\frac{\ell}{2^N}, \frac{\ell+1}{2^N}\right]$, with $\ell = \sum_{j=1}^{N} 2^{N-j} a_j$. Of
course this interval has Lebesgue measure $2^{-N}$. From this observation,
the assertions about the correspondence of $X$ with $[0,1]$ follow easily.
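
The correspondence between cylinder sets and dyadic intervals can be illustrated with exact rational arithmetic. The Python sketch below is only an illustration under our own assumptions: the names `D` and `dyadic_interval`, the prefix $(1,0,1)$, and the restriction to finitely many coordinates (all later coordinates are taken to be 0) are not from the text.

```python
from fractions import Fraction
from itertools import product


def D(x):
    """Image of a finite 0/1 sequence under the map (2), padded with zeros."""
    return sum(Fraction(x_j, 2**j) for j, x_j in enumerate(x, start=1))


def dyadic_interval(prefix):
    """Endpoints l/2^N and (l+1)/2^N attached to the cylinder set fixing `prefix`."""
    N = len(prefix)
    ell = sum(2 ** (N - j) * a_j for j, a_j in enumerate(prefix, start=1))
    return Fraction(ell, 2**N), Fraction(ell + 1, 2**N)


prefix = (1, 0, 1)  # a_1, a_2, a_3 (illustrative assumption)
lo, hi = dyadic_interval(prefix)

# Every point of the cylinder set (here: every finite continuation) lands in [lo, hi].
images = [D(prefix + tail) for tail in product((0, 1), repeat=6)]
print(lo, hi, min(images), max(images))
```

Here $\ell = 5$, so the cylinder set is sent into the interval $[5/8, 6/8]$, whose length $2^{-3}$ equals the measure of the cylinder set.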

The identification of $X$ with $[0,1]$ allows us to write the functions $r_n$
also as functions of $t \in [0,1]$ (each undefined on a finite set); thus we shall
write $r_n(x)$ or $r_n(t)$ interchangeably (with $x \in X$ or $t \in [0,1]$). Note that
$r_1(t) = 1$ for $0 < t < 1/2$, and $r_1(t) = -1$ for $1/2 < t < 1$. Also if we
extend $r_1$ to $\mathbb{R}$ by making it periodic of period 1, then $r_n(t) = r_1(2^{n-1} t)$.
The functions $\{r_n\}$ on $[0,1]$ are the Rademacher functions.
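
The two descriptions of $r_n$, through the binary digit $x_n$ and through the periodic extension of $r_1$, can be compared directly. The following Python sketch is illustrative only; the helper names `r1`, `rademacher`, and `binary_digit`, and the sample points $k/1001$, are our own assumptions.

```python
from fractions import Fraction


def r1(t):
    """r_1 extended to the real line with period 1 (undefined at the jumps)."""
    s = t % 1
    if s == 0 or s == Fraction(1, 2):
        raise ValueError("r_1 is not defined at this point")
    return 1 if s < Fraction(1, 2) else -1


def rademacher(n, t):
    """r_n(t) = r_1(2^(n-1) t)."""
    return r1(2 ** (n - 1) * t)


def binary_digit(n, t):
    """The n-th binary digit x_n of a point t in (0, 1)."""
    return int(t * 2**n) % 2


# The points k/1001 are never dyadic rationals, because 1001 is odd.
samples = [Fraction(k, 1001) for k in range(1, 1001)]
agree = all(
    rademacher(n, t) == 1 - 2 * binary_digit(n, t)
    for n in range(1, 9)
    for t in samples
)
print(agree)
```

Exact fractions are used so that no rounding error can move a sample point across a jump of $r_n$.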

The critical property enjoyed by these functions is their mutual independence,
defined as follows. Given a probability space $(X, m)$, we say
that a sequence $\{f_n\}_{n=1}^{\infty}$ of real-valued measurable$^2$ functions on $X$ are
mutually independent if for any sequence of Borel sets $B_n$ in $\mathbb{R}$

(3) $$m\left(\bigcap_{n=1}^{\infty} \{x : f_n(x) \in B_n\}\right) = \prod_{n=1}^{\infty} m(\{x : f_n(x) \in B_n\}).$$


Figure 1. The Rademacher functions $r_1$ and $r_2$


Similarly, we say that a collection of sets $\{E_n\}$ are mutually independent
if their characteristic functions are mutually independent. There is of
course a similar definition of mutual independence if we are given only a
finite collection $f_1, \ldots, f_N$ of functions or a finite collection $E_1, \ldots, E_N$
of sets. Note that for a pair of sets $E_n$ and $E_m$, this notion coincides with
what has been previously encountered. However a collection of functions
(or sets) need not be mutually independent, even if they are pair-wise
so. (See Exercise 2.) Also, note that if $f_1, \ldots, f_N$ are (say) bounded and
mutually independent functions, then the integral of their product equals
the product of their integrals,

(4) $$\int_X f_1(x) \cdots f_N(x) \, dm = \left(\int_X f_1(x) \, dm\right) \cdots \left(\int_X f_N(x) \, dm\right).$$

This follows by first verifying the identity directly when the $f$’s are finite
linear combinations of characteristic functions and then passing to the
limit.
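
The gap between pair-wise and mutual independence can be seen on the four-point space $\mathbb{Z}_2^2$. The Python sketch below uses a standard example chosen by us, not necessarily the one intended in Exercise 2: the three functions $r_1$, $r_2$, and $r_1 r_2$. The helper names `measure` and `integral` are likewise illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Probability space Z_2^2: four points, each of measure 1/4.
points = list(product((0, 1), repeat=2))
weight = Fraction(1, 4)

# f_1 = r_1, f_2 = r_2, f_3 = r_1 * r_2 (the third function is our illustrative choice).
functions = [
    lambda x: 1 - 2 * x[0],
    lambda x: 1 - 2 * x[1],
    lambda x: (1 - 2 * x[0]) * (1 - 2 * x[1]),
]


def measure(event):
    """Measure of the set of points where `event` holds."""
    return sum(weight for x in points if event(x))


def integral(g):
    """Integral of g over the four-point space."""
    return sum(weight * g(x) for x in points)


# Every pair satisfies the product rule for all choices of values a, b.
pairwise_independent = all(
    measure(lambda x: f(x) == a and g(x) == b)
    == measure(lambda x: f(x) == a) * measure(lambda x: g(x) == b)
    for i, f in enumerate(functions)
    for g in functions[i + 1:]
    for a in (1, -1)
    for b in (1, -1)
)

# The triple fails the product rule in (3), and also fails identity (4).
all_equal_one = measure(lambda x: all(f(x) == 1 for f in functions))
product_of_measures = Fraction(1, 2) ** 3
integral_of_product = integral(
    lambda x: functions[0](x) * functions[1](x) * functions[2](x)
)
print(pairwise_independent, all_equal_one, product_of_measures, integral_of_product)
```

In this example each function has integral 0, yet the product of all three is identically 1, so (4) cannot hold for the triple.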

A general way that independent functions arise is as follows. Suppose
our probability space $(X, m)$ is a product of probability spaces $(X_n, m_n)$,
$n = 1, 2, \ldots$, with $m$ equal to the product measure of the $m_n$. Assume
that the function $f_n(x)$, defined for $x \in X$, depends only on the $n$th coordinate
of $x$, that is $f_n(x) = F_n(x_n)$, where each $F_n$ is given on $X_n$, and
$x = (x_1, x_2, \ldots, x_n, \ldots)$. Then the functions $\{f_n\}$ are mutually independent.
To see this set $E_n = \{x : f_n(x) \in B_n\}$ with $E_n \subset X$, similarly
$E'_n = \{x_n : F_n(x_n) \in B_n\}$ with $E'_n \subset X_n$. Then $E_n = \{x : x_n \in E'_n\}$ is
a cylinder set with $m(E_n) = m_n(E'_n)$. Hence it is clear that for each $N$

$$m\left(\bigcap_{n=1}^{N} E_n\right) = \prod_{n=1}^{N} m_n(E'_n) = \prod_{n=1}^{N} m(E_n).$$

Letting $N \to \infty$ gives (3), proving our assertion. This obviously applies
to the Rademacher functions, showing their mutual independence.

$^2$ All functions (and sets) that arise are henceforth assumed to be measurable. Also we
keep to the assumption that our functions (random variables) are real-valued, except in
Section 1.7 and Section 2.6 onwards.

Incidentally, this example of mutually independent random variables
in a way represents the general situation. (See Exercise 6.)


1.3 Behavior of $S_N$ as $N \to \infty$, first results

After these preliminaries we are ready to consider the behavior of

$$S_N(x) = \sum_{n=1}^{N} r_n(x),$$

which represents player A’s winnings after $N$ flips. It turns out that
the order of magnitude of $S_N(x)$, as $N \to \infty$, is essentially much smaller
than $N$. A hint of what is to be expected comes from the following
observation.


Proposition 1.1 For each integer $N \ge 1$,

(5) $$\|S_N\|_{L^2} = N^{1/2}.$$

This proposition follows from the fact that $\{r_n(t)\}$ is an orthonormal
system in $L^2([0,1])$. Indeed, we have that $\int_0^1 r_n(t) \, dt = 0$ because each
$r_n$ is equal to 1 on a set of measure 1/2, and equal to $-1$ on a set of
measure also 1/2. Moreover, by their mutual independence and (4), we
have

$$\int_0^1 r_n(t) r_m(t) \, dt = 0 \quad \text{if } n \ne m.$$

In addition, we obviously have $\int_0^1 r_n^2(t) \, dt = 1$. Therefore

$$\left\| \sum_{n=1}^{N} a_n r_n \right\|_{L^2}^2 = \sum_{n=1}^{N} |a_n|^2,$$

and the assertion follows by taking $a_n = 1$ for $1 \le n \le N$.

Note: The sequence $\{r_n\}$ is far from complete in $L^2([0,1])$. See Exercises
13 and 16.
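
The identity $\|\sum a_n r_n\|_{L^2}^2 = \sum |a_n|^2$ can be confirmed exactly for small $N$. The Python sketch below relies on the fact, established above, that each sign pattern $(r_1, \ldots, r_N)$ occurs on a set of measure $2^{-N}$; the function name `l2_norm_squared` and the sample coefficients are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def l2_norm_squared(coeffs):
    """Exact value of the integral of (sum a_n r_n)^2 over [0, 1].

    Each sign pattern (r_1, ..., r_N) occurs on a set of measure 2^(-N),
    so the integral is an average over all 2^N patterns.
    """
    N = len(coeffs)
    total = sum(
        sum(a * s for a, s in zip(coeffs, signs)) ** 2
        for signs in product((1, -1), repeat=N)
    )
    return Fraction(total, 2**N)


norm_sq_S8 = l2_norm_squared([1] * 8)        # S_8: all coefficients equal to 1
norm_sq_other = l2_norm_squared([3, -1, 2])  # coefficients 3, -1, 2
print(norm_sq_S8, norm_sq_other)
```

The first value is $\|S_8\|_{L^2}^2 = 8$, in agreement with (5), and the second is $9 + 1 + 4 = 14$.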

As an immediate consequence we have the convergence of the averages
$S_N/N$ to 0 “in probability.” The relevant definition is as follows. One
says that a sequence of functions $\{f_n\}$ converges to $f$ in probability,
if for every $\epsilon > 0$,

$$m(\{x : |f_n(x) - f(x)| > \epsilon\}) \to 0 \quad \text{as } n \to \infty.^3$$

Corollary 1.2 $S_N/N$ converges to 0 in probability.

In fact,

$$m(\{|S_N(x)/N| > \epsilon\}) = m(\{|S_N(x)| > \epsilon N\}) \le \frac{1}{\epsilon^2 N^2} \int |S_N(x)|^2 \, dm,$$

by Tchebychev’s inequality. Hence, by (5), $m(\{x : |S_N(x)/N| > \epsilon\}) \le 1/(\epsilon^2 N) = O(1/N)$,
and the corollary is proved. It is to be noted that by the same argument
one gets the better result that $S_N/N^\alpha \to 0$ in probability as $N \to \infty$,
as long as $\alpha > 1/2$. A stronger version of this conclusion is given in
Corollary 1.5 below.
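
To see how the Tchebychev bound compares with the true probabilities, one can evaluate both for a few values of $N$. The following Python sketch is an illustration only; the names `tail_probability` and `chebyshev_bound` and the tolerance $\epsilon = 0.1$ are our own assumptions.

```python
from math import comb


def tail_probability(N, eps):
    """Exact m({x : |S_N(x)/N| > eps}); S_N = 2h - N when there are h heads."""
    favorable = sum(comb(N, h) for h in range(N + 1) if abs(2 * h - N) > eps * N)
    return favorable / 2**N


def chebyshev_bound(N, eps):
    """The bound 1/(eps^2 N) that follows from Proposition 1.1."""
    return 1 / (eps**2 * N)


eps = 0.1  # illustrative tolerance (assumption)
table = [(N, tail_probability(N, eps), chebyshev_bound(N, eps)) for N in (10, 100, 1000)]
for row in table:
    print(row)
```

The bound is far from sharp, since the exact probabilities decay much faster than $1/N$, but it is all that is needed for convergence in probability.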

1.4 Central limit theorem 

The identity (5) suggests that the way to look more carefully at $S_N$ for
large $N$ is to normalize it and consider instead $S_N/N^{1/2}$. Studying the
limit of this quantity in the appropriate sense leads us to the central
limit theorem. This is expressed in terms of the notion of distribution
measure of a function, defined as follows. Whenever $f$ is a (real-valued)
function on a probability space $(X, m)$, its distribution measure is defined
to be the unique (Borel) measure $\mu = \mu_f$ on $\mathbb{R}$ that satisfies

$$\mu(B) = m(\{x : f(x) \in B\}) \quad \text{for all Borel sets } B \subset \mathbb{R}.$$

Note that a distribution measure is automatically a probability measure
on $\mathbb{R}$, since $\mu(\mathbb{R}) = 1$. Incidentally the distribution measure is closely
related to the distribution function $\lambda$ that appeared in Section 4.1 of
Chapter 2, because

$$\lambda_f(\alpha) = m(\{x : |f(x)| > \alpha\}) = \mu_{|f|}((\alpha, \infty)).$$


$^3$ In measure theory, this notion is usually referred to as “convergence in measure.”

The argument used there to prove (29) can also be applied to establish
the following assertions. First, $f$ is integrable on $X$ precisely when
$\int_{-\infty}^{\infty} |t| \, d\mu(t) < \infty$, and then $\int_X f(x) \, dm = \int_{-\infty}^{\infty} t \, d\mu(t)$. Similarly, $f$ is in
$L^p(X, m)$ exactly when $\int_{-\infty}^{\infty} |t|^p \, d\mu(t)$ is finite and this quantity equals
$\|f\|_{L^p}^p$.

More generally, if $G$ is a non-negative continuous function on $\mathbb{R}$ (or
continuous and bounded), then

(6) $$\int_X G(f)(x) \, dm = \int_{\mathbb{R}} G(t) \, d\mu(t).$$

See Exercise 12.

We say (using the parlance of probability theory) that $f$ has a mean if
$f$ is integrable, and its mean $m_0$ (also called its expectation) is defined
as

$$m_0 = \int_X f(x) \, dm = \int_{-\infty}^{\infty} t \, d\mu(t).$$

If $f$ is also square integrable on $X$, then we define its variance $\sigma^2$ by

$$\sigma^2 = \int_X (f(x) - m_0)^2 \, dm.$$

In particular, if $m_0 = 0$, then

$$\sigma^2 = \|f\|_{L^2}^2 = \int_{-\infty}^{\infty} t^2 \, d\mu(t).$$


A measure $\nu$ that arises naturally in this context is the Gaussian
(or normal distribution), the measure on $\mathbb{R}$ whose density function is
$\frac{1}{\sqrt{2\pi}} e^{-t^2/2}$, that is,

$$\nu((a, b)) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-t^2/2} \, dt.$$

More generally, the normal measure with variance $\sigma^2$ is the one given by

$$\nu_{\sigma^2}((a, b)) = \frac{1}{\sigma \sqrt{2\pi}} \int_a^b e^{-t^2/(2\sigma^2)} \, dt.$$

1.5 Statement and proof of the theorem

We can now come to De Moivre’s theorem, the central limit theorem in
the special context of coin flips. It states that the distribution measure
of $S_N/N^{1/2}$ converges to the normal distribution in the following sense.

Theorem 1.3 For each $a < b$, we have

$$m(\{x : a < S_N(x)/N^{1/2} < b\}) \to \int_a^b \frac{e^{-t^2/2}}{\sqrt{2\pi}} \, dt \quad \text{as } N \to \infty.$$

In proving this result we consider first the case when we restrict ourselves
to $N$ even; the limit when $N$ is restricted to be odd is, except for small
changes, treated the same way. Joining the two cases will give the desired
result.
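
Before turning to the proof, the statement of Theorem 1.3 can be illustrated numerically. The Python sketch below compares the exact probability, obtained by summing (1), with the Gaussian integral evaluated through the error function. The function names, the interval $(-1, 1)$, and the values of $N$ are illustrative assumptions.

```python
from math import comb, erf, sqrt


def exact_probability(N, a, b):
    """m({x : a < S_N(x)/N^(1/2) < b}), summing formula (1) over h = number of heads."""
    lo, hi = a * sqrt(N), b * sqrt(N)
    favorable = sum(comb(N, h) for h in range(N + 1) if lo < 2 * h - N < hi)
    return favorable / 2**N


def gaussian_probability(a, b):
    """Integral of exp(-t^2/2)/sqrt(2 pi) over (a, b), via the error function."""
    return 0.5 * (erf(b / sqrt(2)) - erf(a / sqrt(2)))


a, b = -1.0, 1.0  # illustrative interval (assumption)
errors = {
    N: abs(exact_probability(N, a, b) - gaussian_probability(a, b))
    for N in (100, 2500, 10000)
}
for N, err in errors.items():
    print(N, err)
```

The discrepancy shrinks roughly like $N^{-1/2}$, which is the size of the error term that appears in the proof.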

Proof. According to (1), with $k = 2r$, $r$ an integer, and $\alpha < \beta$,

$$m(\{x : \alpha < S_N(x) < \beta\}) = \sum_{\alpha < 2r < \beta} P_r, \quad \text{where } P_r = 2^{-N} \frac{N!}{(N/2 + r)! \, (N/2 - r)!}.$$

Hence

$$m(\{x : a < S_N(x)/N^{1/2} < b\}) = \sum_{aN^{1/2} < 2r < bN^{1/2}} P_r.$$

With $a$ and $b$ fixed, this means that the $r$’s are restricted by $r = O(N^{1/2})$.
We claim that under this restriction

(7) $$P_r = \frac{2}{\sqrt{2\pi} N^{1/2}} e^{-2r^2/N} (1 + O(1/N^{1/2})) \quad \text{as } N \to \infty.$$

To verify this we use a version of Stirling’s formula,$^4$ which we state as

$$N! = \sqrt{2\pi} \, N^{N+1/2} e^{-N} (1 + O(1/N^{1/2})) \quad \text{as } N \to \infty.$$

It follows from this that

$$P_r = \frac{2}{\sqrt{2\pi} N^{1/2}} \cdot \frac{1}{\left(1 + \frac{2r}{N}\right)^{N/2 + r + 1/2}} \cdot \frac{1}{\left(1 - \frac{2r}{N}\right)^{N/2 - r + 1/2}} \, (1 + O(1/N^{1/2})).$$

Now $\log(1 + x) = x - x^2/2 + O(|x|^3)$, as $x \to 0$, so if

$$A_r = \left(\frac{N}{2} + r + \frac{1}{2}\right) \log(1 + 2r/N),$$

then

$$A_r = \left(\frac{N}{2} + r + \frac{1}{2}\right) \left(\frac{2r}{N} - \frac{2r^2}{N^2}\right) + O(N^{-1/2}), \quad \text{since } r = O(N^{1/2}).$$

Hence $A_r + A_{-r} = \frac{2r^2}{N} + O(N^{-1/2})$, and because

$$\left(1 + \frac{2r}{N}\right)^{N/2 + r + 1/2} \left(1 - \frac{2r}{N}\right)^{N/2 - r + 1/2} = e^{A_r + A_{-r}},$$

we have the asserted result (7).

$^4$ See for instance Theorem 2.3 in Appendix A, Book II. The error terms $O(1/N^{1/2})$
can be improved; but even a weaker bound would suffice for our purpose.



Figure 2. Approximating the integral of a Gaussian 


Now $e^{-2r^2/N} - e^{-2t^2} = O(e^{-2t^2}/N^{1/2})$ if $t \in [r/N^{1/2}, (r+1)/N^{1/2}]$, again
because $r = O(N^{1/2})$. Therefore

$$\frac{1}{N^{1/2}} e^{-2r^2/N} = \int_{r/N^{1/2}}^{(r+1)/N^{1/2}} e^{-2t^2} \, dt \, (1 + O(N^{-1/2})).$$

Taking (7) into account we see that as a result

$$m(\{x : a < S_N(x)/N^{1/2} < b\}) = \sum_{aN^{1/2} < 2r < bN^{1/2}} P_r$$
$$= \int_{a/2}^{b/2} \frac{2}{\sqrt{2\pi}} e^{-2t^2} \, dt + O(N^{-1/2})$$
$$= \int_a^b \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt + O(N^{-1/2}),$$

upon making the change of variables $t \to t/2$. Letting $N \to \infty$ gives our
desired conclusion.

1.6 Random series

A striking illustration of randomness inherent in the Rademacher functions
is the observation that, although the series $\sum_{n=1}^{\infty} 1/n$ diverges, the
series $\sum_{n=1}^{\infty} \pm 1/n$ converges for “almost all” choices of $\pm$ signs, where
the $\pm$ signs for the different $n$’s are chosen independently and with equal
probability.

A precise and more general formulation is as follows.

Theorem 1.4

(a) Suppose $\sum_{n=1}^{\infty} |a_n|^2 < \infty$. Then for almost every $t \in [0,1]$, the series
$\sum_{n=1}^{\infty} a_n r_n(t)$ converges.

(b) However if $\sum_{n=1}^{\infty} |a_n|^2$ diverges, then $\sum_{n=1}^{\infty} a_n r_n(t)$ diverges for
almost all $t \in [0,1]$.

Note. The fact that these conclusions must hold almost everywhere (if
they hold on sets of positive measure) is a particular case of the “zero-one
law.” More about this in Section 2.3.

To prove the theorem, recall that $\{r_n\}$ is an orthonormal sequence
in $L^2([0,1])$. Thus if $\sum_{n=1}^{\infty} |a_n|^2 < \infty$, the sequence $\{\sum_{n=1}^{N} a_n r_n(t)\}$
converges in the $L^2$ norm, as $N \to \infty$, to a function $f \in L^2([0,1])$. For
this $f$ it is convenient to write

$$f \sim \sum_{n=1}^{\infty} a_n r_n, \quad \text{and set} \quad S_N(f) = \sum_{n=1}^{N} a_n r_n.$$

To prove the almost everywhere convergence of the $S_N(f)$, we bring in
averaging operators that average over dyadic intervals, which are defined
as follows. For each positive integer $n$ the dyadic intervals of length $2^{-n}$
are the $2^n$ sub-intervals of $[0,1]$ of the form $\left(\frac{\ell}{2^n}, \frac{\ell+1}{2^n}\right]$, with $0 \le \ell < 2^n$.
These obviously form a disjoint covering of $[0,1]$ (except for the origin).
Now for each $f$ that is integrable over $[0,1]$, and every $n$, set

$$E_n(f)(t) = \frac{1}{m(I)} \int_I f(s) \, ds$$

when $t \in I$, and $I$ is a dyadic interval of length $2^{-n}$. (Note that $E_n(f)(t)$
is not defined for $t = 0$, but this is immaterial.)

For the functions $f$ that arise as above (as $L^2$ limits of finite linear
combinations of the $r_n$), there is the basic identity

(8) $$E_N(f) = S_N(f) \quad \text{for all } N.$$

To prove this note first that $E_N(r_n) = r_n$ if $N \ge n$. In fact, when $N \ge
n$, each $r_n$ is constant on every dyadic interval of length $2^{-N}$. Also
$E_N(r_n) = 0$ if $n > N$, since the integral of $r_n$ on each dyadic interval of
length $2^{-N}$ vanishes. These facts are easily reduced to the case $n = 1$ by
using the identity $r_n(t) = r_1(2^{n-1} t)$. Thus we have shown that (8) holds
for any finite linear combination of the Rademacher functions. Hence
$S_N(f) = E_N(S_n(f))$, if $n \ge N$, and a passage to the limit, as $n \to \infty$,
establishes (8).
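
The two facts about $E_N(r_n)$ used in this argument can be verified exactly on a small scale. The following Python sketch is illustrative only; the names `rademacher` and `dyadic_average` and the level $N = 3$ are our own assumptions. It computes the average of $r_n$ over each dyadic interval of length $2^{-N}$ by sampling midpoints of finer dyadic intervals on which $r_n$ is constant.

```python
from fractions import Fraction


def rademacher(n, t):
    """r_n(t) read off from the n-th binary digit of t (t must not be a jump of r_n)."""
    return 1 - 2 * (int(t * 2**n) % 2)


def dyadic_average(n, N, ell):
    """Average of r_n over the dyadic interval (ell/2^N, (ell+1)/2^N).

    r_n is constant on dyadic intervals of length 2^(-M) once M >= n, so
    sampling the midpoints of those finer intervals gives the exact average.
    """
    M = max(n, N) + 1
    pieces = 2 ** (M - N)
    midpoints = [
        Fraction(2 * (ell * pieces + j) + 1, 2 ** (M + 1)) for j in range(pieces)
    ]
    return Fraction(sum(rademacher(n, t) for t in midpoints), pieces)


N = 3  # illustrative level (assumption)
averages = {
    (n, ell): dyadic_average(n, N, ell) for n in range(1, 7) for ell in range(2**N)
}
print(averages[(2, 5)], averages[(5, 5)])
```

For $n \le N$ the average reproduces the constant value of $r_n$ on the interval, and for $n > N$ it vanishes, which is exactly what (8) requires term by term.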

Now by the Lebesgue differentiation theorem$^5$ $\lim_{N \to \infty} E_N(f)(t)$ exists
and equals $f(t)$ at all points of the Lebesgue set of $f$, and hence almost
everywhere. Thus by (8) the series converges almost everywhere, and
part (a) is proved.

Before we turn to the converse, part (b), we digress to strengthen
the conclusion obtained in Section 1.3. There we considered the sums
$S_N(t) = \sum_{n=1}^{N} r_n(t)$ and showed that $S_N/N \to 0$ in probability. This
initial conclusion is itself implied by the “strong law of large numbers,”
which in this case takes the following form.

Corollary 1.5 Let $S_N(t) = \sum_{n=1}^{N} r_n(t)$. Then $S_N(t)/N \to 0$, as $N \to
\infty$ for almost every $t$. In fact, if $\alpha > 1/2$, then $S_N(t)/N^\alpha \to 0$ for almost
every $t$.

Proof. Fix $1/2 < \beta < \alpha$, and let $a_n = n^{-\beta}$ and $b_n = n^{\beta}$. Clearly
$\sum_{n=1}^{\infty} a_n^2 < \infty$. Set $\tilde{S}_N(t) = \sum_{n=1}^{N} a_n r_n(t)$. Then, by summation by parts,
setting $\tilde{S}_0 = 0$, we get

$$S_N(t) = \sum_{n=1}^{N} r_n = \sum_{n=1}^{N} a_n r_n b_n = \sum_{n=1}^{N} (\tilde{S}_n - \tilde{S}_{n-1}) b_n = \tilde{S}_N b_N + \sum_{n=1}^{N-1} \tilde{S}_n (b_n - b_{n+1}).$$

However $|b_n - b_{n+1}| = b_{n+1} - b_n$, and $\sum_{n=1}^{N-1} (b_{n+1} - b_n) = b_N - 1 =
O(N^\beta)$, while the convergence of the series $\sum a_n r_n(t)$ for almost all $t$
guarantees that $|\tilde{S}_n(t)| = O(1)$ for almost every $t$. As a result, for those $t$,
$S_N(t) = O(N^\beta)$ and this implies $S_N(t)/N^\alpha \to 0$ for almost all $t$, proving
the corollary.
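
The summation by parts identity used in this proof holds for every choice of signs, and can be checked numerically. The Python sketch below is an illustration under our own assumptions: the function name `summation_by_parts`, the value $\beta = 0.75$, the length 1000, and the seeded pseudo-random signs standing in for the values $r_n(t)$ are not from the text.

```python
import random


def summation_by_parts(signs, beta):
    """Return (S_N, right-hand side of the summation-by-parts identity)."""
    N = len(signs)
    a = [n ** -beta for n in range(1, N + 1)]  # a_n = n^(-beta)
    b = [n ** beta for n in range(1, N + 1)]   # b_n = n^beta
    tilde = [0.0]                              # tilde[n] = sum of a_k r_k for k <= n
    for a_n, r_n in zip(a, signs):
        tilde.append(tilde[-1] + a_n * r_n)
    direct = sum(signs)
    by_parts = tilde[N] * b[N - 1] + sum(
        tilde[n] * (b[n - 1] - b[n]) for n in range(1, N)
    )
    return direct, by_parts


rng = random.Random(0)  # fixed seed (assumption) so the run is reproducible
signs = [rng.choice((1, -1)) for _ in range(1000)]
direct, by_parts = summation_by_parts(signs, beta=0.75)
print(direct, by_parts)
```

The two numbers agree up to floating-point rounding; the content of the proof is that the right-hand side is $O(N^\beta)$ whenever the sums $\tilde{S}_n(t)$ stay bounded.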


$^5$ See for example Theorem 1.3 and its corollaries in Chapter 3, Book III.

We now turn to the proof of part (b) of the theorem. It is based on
the following lemma.

Lemma 1.6 Suppose $E$ is a subset of $[0,1]$ with $m(E) > 0$. Then there
is a $c > 0$ and a positive integer $N_0$ so that if $F$ is any finite sum of the
form

$$F(t) = \sum_{n > N_0} a_n r_n(t)$$

then

$$\int_E |F(t)|^2 \, dt \ge c \sum_{n > N_0} a_n^2.$$
n>No 


Besides the orthogonality of the $\{r_n\}$ already used, the proof requires a
stronger orthogonality, which again exploits the mutual independence of
the Rademacher functions.

For each ordered pair $(n, m)$, with $n < m$, define $\varphi_{n,m}(t) = r_n(t) r_m(t)$.
Then the collection $\{\varphi_{n,m}\}$ is an orthonormal sequence in $L^2([0,1])$. To
see this, consider $\int_0^1 \varphi_{n,m}(t) \varphi_{n',m'}(t) \, dt$. When $(n, m) = (n', m')$ the integral
clearly equals 1. Now if $(n, m) \ne (n', m')$, but $n$ or $m$ equals $n'$
or $m'$ (in any order), then we see that the integral vanishes by the orthogonality
of the $\{r_n\}$. Finally, if neither $n$ nor $m$ equals $n'$ or $m'$, then
we apply (4) to the four mutually independent functions $r_n$, $r_m$, $r_{n'}$ and
$r_{m'}$, establishing the assertion.

Assuming that $F$ is any finite sum of the form $\sum a_n r_n(t)$, we have

$$(F(t))^2 = \sum_n a_n^2 r_n^2(t) + 2 \sum_{n < m} a_n a_m r_n(t) r_m(t),$$

hence

(9) $$\int_E (F(t))^2 \, dt = m(E) \sum_n a_n^2 + 2 \sum_{n < m} a_n a_m \gamma_{n,m},$$

with $\gamma_{n,m} = \int_E r_n(t) r_m(t) \, dt = \int_0^1 \chi_E(t) \varphi_{n,m}(t) \, dt$. Thus by the orthogonality
of the $\{\varphi_{n,m}\}$ and Bessel’s inequality,$^6$ $\sum_{n < m} \gamma_{n,m}^2 \le m(E) \le 1$.
Hence for any fixed $\delta > 0$ ($\delta$ will be chosen momentarily), there is an $N_0$
so that $\sum_{N_0 < n < m} \gamma_{n,m}^2 \le \delta$. We apply this with Schwarz’s inequality to
the last term on the right-hand side of (9), restricting ourselves to $F$’s of
the form $F(t) = \sum_{n > N_0} a_n r_n(t)$. The result is that this term is bounded
by

$$2 \left( \sum_{N_0 < n < m} (a_n a_m)^2 \right)^{1/2} \delta^{1/2} \le 2 \delta^{1/2} \sum_{n > N_0} a_n^2.$$

If we choose $\delta$ so that $2\delta^{1/2} \le m(E)/2$, then from (9) we get

$$\int_E |F(t)|^2 \, dt \ge \frac{1}{2} m(E) \sum_{n > N_0} a_n^2,$$

and the lemma is proved with $c = m(E)/2$.

$^6$ For Bessel’s inequality, see Section 2.1 in Chapter 4 of Book III.

To conclude the proof of part (b) of Theorem 1.4 we suppose the
opposite, that the sequence of partial sums $S_N(t) = \sum_{n=1}^{N} a_n r_n(t)$ converges
in a set of positive measure. Then this sequence is uniformly bounded on
a set of positive measure, and that means that there is an $M$ and a set $E$
with $m(E) > 0$, so that $|S_N(t)| \le M$ for all $N$ if $t \in E$. As a result there
is an $M'$ so that, for all $N > N_0$, one has $\left| \sum_{N_0 < n \le N} a_n r_n(t) \right| \le M'$, whenever $t \in E$.
The lemma guarantees that $\sum_{N_0 < n \le N} a_n^2 \le c^{-1} (M')^2$ for all $N$, and
letting $N \to \infty$ gives us that $\sum a_n^2$ converges. This establishes the contradiction
and finishes the proof of the theorem.


1.7 Random Fourier series 

The ideas above can also be used to obtain remarkable results about
random Fourier series, that is, Fourier series on $[0, 2\pi]$ of the form

$$\sum_{n=-\infty}^{\infty} \pm c_n e^{in\theta}.$$

To parametrize the choices of $\pm$ signs in terms of the Rademacher functions,
we need to re-index these functions so that their indices range
over $\mathbb{Z}$. For this reason, it is convenient to change notation and write $\rho_n$
for the functions defined by $\rho_n(t) = r_{2n+1}(t)$ if $n \ge 0$, and $\rho_n(t) = r_{-2n}(t)$
if $n < 0$, with $n \in \mathbb{Z}$. We allow the coefficients $c_n$ to be complex, so that
here we deal with complex-valued functions.

Theorem 1.7

(a) If $\sum_{n=-\infty}^{\infty} |c_n|^2 < \infty$, then for almost every $t \in [0,1]$ the function

(10) $$f_t(\theta) \sim \sum_{n=-\infty}^{\infty} \rho_n(t) c_n e^{in\theta}$$

belongs to $L^p([0, 2\pi])$ for every $p < \infty$.

(b) If $\sum_{n=-\infty}^{\infty} |c_n|^2 = \infty$, then for almost every $t \in [0,1]$ the series (10)
is not the Fourier series of an integrable function.

The proof is based on Khinchin’s inequality, which like Lemma 1.6 is a
further exploitation of the independence of the Rademacher functions.

Suppose $\{a_n\}$ are complex numbers with $\sum_{n=-\infty}^{\infty} |a_n|^2 < \infty$. Let
$F(t) = \sum_{n=-\infty}^{\infty} a_n \rho_n(t)$, with $F$ taken as the $L^2$ limit in $L^2([0,1])$ of
the partial sums.

Lemma 1.8 For each $p < \infty$ there is a bound $A_p$ so that

$$\|F\|_{L^p} \le A_p \|F\|_{L^2},$$

for all $F \in L^p([0,1])$ of the form $F(t) = \sum_{n=-\infty}^{\infty} a_n \rho_n(t)$.

It clearly suffices to prove the corresponding statement when the $a_n$ are
assumed real and have been normalized so that $\|F\|_{L^2}^2 = \sum_{n=-\infty}^{\infty} a_n^2 = 1$.

Now observe that the defining property (3) shows that whenever $\{f_n\}$
is a sequence of mutually independent (real-valued) functions, so is the
sequence $\{\Phi_n(f_n)\}$, with $\{\Phi_n\}$ any sequence of continuous functions from
$\mathbb{R}$ to $\mathbb{R}$. As a result the functions $\{e^{a_n \rho_n(t)}\}$ are mutually independent.
Thus if $F_N(t) = \sum_{|n| \le N} a_n \rho_n(t)$, then

(11) $$\int_0^1 e^{F_N(t)} \, dt = \int_0^1 \left( \prod_{n=-N}^{N} e^{a_n \rho_n(t)} \right) dt = \prod_{n=-N}^{N} \left( \int_0^1 e^{a_n \rho_n(t)} \, dt \right).$$

However, $\int_0^1 e^{a_n \rho_n(t)} \, dt = \cosh(a_n)$, since each $\rho_n$ takes values $+1$ or $-1$
on sets of measure 1/2 respectively. Also, $\cosh(x) \le e^{x^2}$ for real $x$, as a
comparison of their power series clearly shows. Hence

$$\int_0^1 e^{F_N(t)} \, dt \le \prod_{n=-N}^{N} e^{a_n^2} \le e^{\sum a_n^2} \le e.$$

A similar inequality holds with the $a_n$ replaced by $-a_n$. Altogether then

$$\int_0^1 e^{|F_N(t)|} \, dt \le 2e.$$

A simple passage to the limit, as $N \to \infty$, then gives that $e^{|F(t)|}$ is integrable
over $[0,1]$, and $\int_0^1 e^{|F(t)|} \, dt \le 2e$. However for each $p$ there is a
constant $c_p$ so that $u^p \le c_p e^u$ for all $u \ge 0$. Thus $\|F\|_{L^p}^p \le 2e c_p$, and the
lemma is proved with $A_p = (2e c_p)^{1/p}$.
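
The factorization (11) and the elementary bound $\cosh(x) \le e^{x^2}$ can be checked numerically for a handful of coefficients. The Python sketch below is only an illustration; the function name `mean_exp` and the sample coefficients are our own assumptions. It uses the fact that, by independence, each sign pattern of finitely many $\rho_n$ occurs on a set of equal measure.

```python
from itertools import product
from math import cosh, exp, prod


def mean_exp(coeffs):
    """Exact integral over [0, 1] of exp(sum a_n rho_n(t)), up to rounding.

    Independence means every sign pattern occurs on a set of measure 2^(-K),
    where K is the number of coefficients, so the integral is a finite average.
    """
    K = len(coeffs)
    total = sum(
        exp(sum(a * s for a, s in zip(coeffs, signs)))
        for signs in product((1, -1), repeat=K)
    )
    return total / 2**K


coeffs = [0.5, -0.3, 0.2, 0.7]           # illustrative real coefficients (assumption)
lhs = mean_exp(coeffs)                   # left-hand side of (11)
rhs = prod(cosh(a) for a in coeffs)      # product of the one-variable integrals
bound = exp(sum(a * a for a in coeffs))  # bound obtained from cosh(x) <= exp(x^2)
print(lhs, rhs, bound)
```

The first two numbers coincide up to rounding, and both lie below the third, as the argument above predicts.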

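We turn now to the proof of part (a) of the theorem. We may assume
that $\sum_{n=-\infty}^{\infty} |c_n|^2 = 1$, and set $F(t) = f_t(\theta) = \sum_{n=-\infty}^{\infty} a_n \rho_n(t)$, with $a_n = c_n e^{in\theta}$, and $\theta$
fixed. Now

$$\int_0^1 |F(t)|^p \, dt = \int_0^1 |f_t(\theta)|^p \, dt \le A_p^p$$

by the lemma. Thus integrating over $\theta \in [0, 2\pi]$ gives

$$\int_0^{2\pi} \int_0^1 |f_t(\theta)|^p \, dt \, d\theta \le 2\pi A_p^p$$

and by Fubini’s theorem,

$$\int_0^{2\pi} |f_t(\theta)|^p \, d\theta < \infty,$$

for almost every $t \in [0,1]$, and this is what was to be proved.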

To prove the converse, part (b) in the theorem, suppose that for a
set $E_1 \subset [0,1]$ of positive measure we have $f_t(\theta) \in L^1([0, 2\pi])$, whenever
$t \in E_1$. Since every function in $L^1([0, 2\pi])$ has a Fourier series that is
Cesàro summable almost everywhere, it then follows that there is a set
$\tilde{E} \subset [0,1] \times [0, 2\pi]$ of positive two-dimensional measure, and an $M$ so
that

(12) $$\sup_N |\sigma_N(f_t)(\theta)| \le M \quad \text{for each } (t, \theta) \in \tilde{E}.$$

Here $\sigma_N$ is the Cesàro sum given by $\sigma_N(f_t)(\theta) = \sum_{|n| \le N} \rho_n(t) c_n e^{in\theta} (1 -
|n|/N)$. However, by Fubini’s theorem, (12) holds for at least one $\theta_0$, and
all $t \in E$, where $m(E) > 0$. Now write $c_n e^{in\theta_0} = \alpha_n + i\beta_n$, with $\alpha_n$ and
$\beta_n$ real, then apply Lemma 1.6. Thus there is an $M'$ and an $N_0$ so that

$$\sup_N \sum_{N_0 < |n| \le N} \alpha_n^2 \left(1 - \frac{|n|}{N}\right)^2 \le M',$$

and letting $N \to \infty$ shows that $\sum_{n=-\infty}^{\infty} \alpha_n^2$ converges. Similarly $\sum_{n=-\infty}^{\infty} \beta_n^2$
converges and the theorem is proved.

1.8 Bernoulli trials 

Many of the results in Sections 1.1 to 1.5 that were proved above continue to hold in modified form when the equal probabilities of heads and tails are replaced by probabilities $p$ and $q$ with $p + q = 1$ and $0 < p < 1$. This more general situation is often referred to as that of Bernoulli trials.

To consider it we begin by replacing the probability measure on $\mathbb{Z}_2^\infty$ by the measure $m_p$ that arises as the product measure on $\mathbb{Z}_2^\infty$ where for each factor $\mathbb{Z}_2 = \{0, 1\}$ the point 0 is assigned measure $p$ and the point 1 is assigned measure $q$. (Incidentally, when $p \ne 1/2$, then under the mapping $D : \mathbb{Z}_2^\infty \to [0,1]$, the measure $m_p$ now corresponds to a singular measure $d\mu_p$ on $[0,1]$. For this, see Problem 1.)

In this setting the law of large numbers takes the form that $S_N/N \to p - q$ as in Corollaries 1.2 and 1.5. The proof of the analog of the first corollary can be carried out in much the same way as before. The variant of the second corollary requires some further ideas and is dealt with in a general context in the next section. In addition, a modification of the proof of Theorem 1.3 gives its analog

$$m_p\left(\left\{x : a < \frac{S_N(x) - N(p - q)}{N^{1/2}} < b\right\}\right) \to \frac{1}{\sigma\sqrt{2\pi}} \int_a^b e^{-t^2/(2\sigma^2)}\,dt$$

as $N \to \infty$, where $\sigma^2 = 1 - (p - q)^2$.

This result is subsumed in the general form of the central limit theorem 
proved in the last part of the next section. 
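
As a rough numerical illustration of the limit above (a sketch under stated assumptions, not part of the argument), one can compare the exact $m_p$-probability with the Gaussian integral. The function name `bernoulli_clt_check` and the sample values $p = 0.3$, $N = 1600$ are hypothetical choices made only for this example; we assume, as in the coin-flip model, that each coordinate contributes $+1$ when it equals 0 and $-1$ when it equals 1.

```python
import math

def bernoulli_clt_check(p, N, a, b):
    """Compare m_p({a < (S_N - N(p-q))/sqrt(N) < b}) with its Gaussian limit.

    S_N = (#zeros) - (#ones) among the first N coordinates, where each
    coordinate is 0 with probability p and 1 with probability q = 1 - p.
    """
    q = 1.0 - p
    mean = p - q                    # mean of a single step (+1 or -1)
    var = 1.0 - (p - q) ** 2        # variance sigma^2 of a single step
    exact = 0.0
    for k in range(N + 1):          # k = number of ones, so S_N = N - 2k
        z = ((N - 2 * k) - N * mean) / math.sqrt(N)
        if a < z < b:
            # binomial probability computed through logarithms to avoid underflow
            log_pmf = (math.lgamma(N + 1) - math.lgamma(k + 1)
                       - math.lgamma(N - k + 1)
                       + k * math.log(q) + (N - k) * math.log(p))
            exact += math.exp(log_pmf)
    sigma = math.sqrt(var)
    # (1/(sigma sqrt(2 pi))) * integral_a^b exp(-t^2/(2 sigma^2)) dt, via erf
    gauss = 0.5 * (math.erf(b / (sigma * math.sqrt(2)))
                   - math.erf(a / (sigma * math.sqrt(2))))
    return exact, gauss

exact, gauss = bernoulli_clt_check(0.3, 1600, -1.0, 1.0)
```

The two returned numbers should be close for large $N$; the discrepancy reflects the lattice structure of $S_N$ and shrinks as $N$ grows.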


2 Sums of independent random variables 

Our aim in this section is to put in a more general and abstract form 
some of the results for coin flips and Bernoulli trials dealt with in the 
first section. To begin with, we shall present a version of the law of large 
numbers. 

2.1 Law of large numbers and ergodic theorem 

Here we deduce a general form of this law from the ergodic theorem.⁷ Another version, derived from the theory of martingales, will be presented in Section 2.2 below.

7 A treatment of the ergodic theorem needed here can be found in Section 5* of Chapter 6 in Book III.

A sequence $(f_0, f_1, \dots, f_n, \dots)$ of functions is said to be identically distributed if the distribution measures $\mu_n$ of $f_n$ (as defined in Section 1.4) are independent of $n$, that is, the measures $m(\{x : f_n(x) \in B\})$ are the same for all $n$ for every Borel set $B$. If the sequence $\{f_n\}$ is identically distributed and if $f_0$ has a mean (equal to $m_0$), then of course all $f_n$ have a mean that equals $m_0$. The first main theorem is as follows.

Theorem 2.1 Suppose $\{f_n\}$ is a sequence of functions that are mutually independent, are identically distributed, and have mean $m_0$. Then

$$\frac{1}{N} \sum_{n=0}^{N-1} f_n(x) \to m_0$$

for almost every $x \in X$, as $N \to \infty$.


The possibility of reducing this theorem to the ergodic theorem depends 
on the device of replacing the sequence {/ n } by another sequence that is 
“equimeasurable” with the first, in the following sense. 

Given functions $f_1, \dots, f_N$, their joint distribution measure is defined as the measure $\mu_{f_1, \dots, f_N}$ on $\mathbb{R}^N$ that satisfies for all Borel sets $B \subset \mathbb{R}^N$

$$\mu_{f_1, \dots, f_N}(B) = m(\{x : (f_1(x), \dots, f_N(x)) \in B\}).$$

Now suppose $\{g_n\}$ is a sequence on a (possibly different) probability space $(Y, m^*)$. Then we say that $\{f_n\}$ and $\{g_n\}$ have the same joint distribution if for every $N$, we have

$$\mu_{f_1, \dots, f_N}(B) = \mu_{g_1, \dots, g_N}(B) \quad \text{for all Borel sets } B \subset \mathbb{R}^N.$$

With this definition in hand we come to the space $Y$ that is relevant here. It is the infinite product $Y = \mathbb{R}^\infty = \prod_{j=0}^{\infty} R_j$, where each $R_j$ is $\mathbb{R}$. On each $R_j$ we consider the measure $\mu$, the common distribution measure of the $f_n$. Define $m^*$ to be the corresponding product measure on $Y$.

We also consider the shift $\tau : Y \to Y$, given by $\tau(y) = (y_{n+1})_{n=0}^{\infty}$ if $y = (y_n)_{n=0}^{\infty}$. Finally we take for the $\{g_n\}$ the coordinate functions on $Y$ given by $g_n(y) = y_n$, if $y = (y_n)_{n=0}^{\infty}$.

Everything will now be a consequence of the following four steps.

Observation 1. $g_n(\tau(y)) = g_{n+1}(y)$ for all $n \ge 0$; hence $g_n(y) = g_0(\tau^n y)$.

Observation 2. $\tau$ is measure-preserving and ergodic.

Conclusion 1. $\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} g_n(y) = m_0$, for almost every $y \in Y$.

Conclusion 2. $\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} f_n(x) = m_0$, for almost every $x \in X$.


The first observation is immediate.

That $\tau$ is measure preserving means that $m^*(\tau^{-1}(E)) = m^*(E)$ for every (measurable) set $E \subset Y$. Since $Y$ is a product space, it suffices to verify this for all cylinder sets $E$, and then a simple limiting argument proves this for the general set $E$. If $E$ is a cylinder set, $E$ depends only on the first $N$ coordinates, for some $N$. This means that $E = E' \times \prod_{j=N}^{\infty} R_j$, with $E'$ a subset of $\prod_{j=0}^{N-1} R_j$, and $m^*(E) = \mu^{(N)}(E')$, where $\mu^{(N)}$ is the $N$-fold product of $\mu$ on the first $N$ factors. However

$$\tau^{-1}(E) = R_0 \times E'' \times \prod_{j=N+1}^{\infty} R_j,$$

with $(y'_1, \dots, y'_N) \in E''$ if and only if $(y_0, \dots, y_{N-1}) \in E'$, where $y'_{n+1} = y_n$, for $0 \le n \le N - 1$. Thus $m^*(\tau^{-1}(E)) = \mu^{(N+1)}(R_0 \times E'') = \mu^{(N)}(E')$, and the assertion $m^*(\tau^{-1}(E)) = m^*(E)$ is proved.

The ergodicity of $\tau$ follows from the fact that $\tau$ is mixing,⁸ which means

$$(13) \qquad \lim_{n \to \infty} m^*(\tau^{-n}(E) \cap F) = m^*(E)\,m^*(F)$$

for all pairs of sets $E, F \subset Y$.

To prove the mixing property it suffices, as before, to assume that both $E$ and $F$ are cylinder sets. So, for a sufficiently large $N$ we have that $E = E' \times \prod_{j=N}^{\infty} R_j$ and $F = F' \times \prod_{j=N}^{\infty} R_j$, where both $E'$ and $F'$ are subsets of $\prod_{j=0}^{N-1} R_j$. Now, as above, if $n \ge 1$,

$$\tau^{-n}(E) = \prod_{j=0}^{n-1} R_j \times E'' \times \prod_{j=N+n}^{\infty} R_j,$$

where $E''$ is the subset of $\prod_{j=n}^{N+n-1} R_j$ that corresponds to $E'$. Thus if $n \ge N$,

$$\tau^{-n}(E) \cap F = F' \times \prod_{j=N}^{n-1} R_j \times E'' \times \prod_{j=N+n}^{\infty} R_j.$$

As a result $m^*(\tau^{-n}(E) \cap F) = m^*(E)\,m^*(F)$ whenever $n \ge N$ and (13) is established.
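
The two computations with cylinder sets can be checked exactly in a small finite model. The following is only a sketch under an assumption that is not in the text: each factor $R_j$ is replaced by the two-point set $\{0, 1\}$ with $\mu(\{0\}) = 1/3$ and $\mu(\{1\}) = 2/3$, so that a cylinder set is described by a finite set of tuples. The names `measure`, `shift_preimage` and `pad` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

# Assumed finite model: each factor is {0, 1} with mu({0}) = 1/3, mu({1}) = 2/3.
MU = {0: Fraction(1, 3), 1: Fraction(2, 3)}

def measure(base):
    """m*-measure of the cylinder set whose leading coordinates lie in `base`."""
    total = Fraction(0)
    for y in base:
        weight = Fraction(1)
        for coordinate in y:
            weight *= MU[coordinate]
        total += weight
    return total

def shift_preimage(base, n=1):
    """Base of tau^{-n}(E): n free coordinates followed by a point of `base`."""
    return {free + y for free in product(MU, repeat=n) for y in base}

def pad(base, n):
    """The same cylinder set, described with n additional free coordinates."""
    return {y + free for y in base for free in product(MU, repeat=n)}

E = {(0, 1), (1, 1)}             # both sets depend on the first N = 2 coordinates
F = {(0, 0), (1, 0), (1, 1)}

preserved = measure(shift_preimage(E)) == measure(E)
mixing = [measure(shift_preimage(E, n) & pad(F, n)) == measure(E) * measure(F)
          for n in (2, 3, 4)]
```

Exact rational arithmetic is used so that the identities hold with equality rather than up to rounding.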

It follows immediately from (13), when taking $F = E$, that if $E$ is an invariant set, that is $\tau^{-1}(E) = E$ almost everywhere, then $m^*(E) = (m^*(E))^2$, so $m^*(E) = 0$ or $m^*(E) = 1$. Thus there is no proper subset of $Y$ invariant under $\tau$, and this means that $\tau$ is ergodic; so our second observation is established.

8 Also referred to as "strongly-mixing"; see Chapter 6 in Book III.

Now the function $g_0$ is integrable on $Y$ since

$$\int_Y |g_0(y)|\,dm^*(y) = \int_{\mathbb{R}} |y_0|\,d\mu(y_0) = \int_X |f_0(x)|\,dm(x) < \infty,$$

because $\mu$ is the distribution measure of $f_0$, which is integrable. We can now apply the ergodic theorem in Corollary 5.6 of Chapter 6, Book III, which gives us the first conclusion with $m_0 = \int_Y g_0\,dm^* = \int_X f_0\,dm$.

To deduce the second conclusion we need the following lemmas. 

Lemma 2.2 If $\{f_n\}$ and $\{g_n\}$ have the same joint distribution, then so do the sequences $\{\Phi_N(f)\}$ and $\{\Phi_N(g)\}$. Here $\Phi_N(f) = \Phi_N(f_1, \dots, f_N)$, $\Phi_N(g) = \Phi_N(g_1, \dots, g_N)$, and each $\Phi_N$ is a continuous function from $\mathbb{R}^N$ to $\mathbb{R}$.

To see this, note that if $B \subset \mathbb{R}^N$ is a Borel set, and $\Phi = (\Phi_1, \dots, \Phi_N)$, then $B' = \Phi^{-1}(B)$ is also a Borel set in $\mathbb{R}^N$, so if $f = (f_1, \dots, f_N)$ and $g = (g_1, \dots, g_N)$, then $\mu_{\Phi(f)}(B) = \mu_f(B')$ and $\mu_{\Phi(g)}(B) = \mu_g(B')$. Since $f$ and $g$ have the same joint distribution we must have $\mu_f(B') = \mu_g(B')$, and the lemma is proved.

Lemma 2.3 If $\{F_N\}$ and $\{G_N\}$ have the same joint distribution, then $F_N(x) \to m_0$ almost everywhere as $N \to \infty$ if and only if $G_N(y) \to m_0$ almost everywhere as $N \to \infty$.

To prove this lemma, note that if we define $E_{N,k} = \{x : \sup_{r \ge N} |F_r(x) - m_0| \le 1/k\}$, then $F_N \to m_0$ almost everywhere if and only if $m(E_{N,k}) \to 1$, as $N \to \infty$, for each $k$. If $E'_{N,k} = \{y : \sup_{r \ge N} |G_r(y) - m_0| \le 1/k\}$, then $m(E_{N,k}) = m^*(E'_{N,k})$, and this leads to our desired result.

Once we take $\Phi_N(t_1, \dots, t_N) = \frac{1}{N} \sum_{k=1}^{N} t_k$, $F_N(x) = \frac{1}{N} \sum_{k=0}^{N-1} f_k(x)$, and $G_N(y) = \frac{1}{N} \sum_{k=0}^{N-1} g_k(y)$, we see that the lemmas complete the proof of the theorem.
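
The statement of Theorem 2.1 can be observed numerically. The sketch below is only an illustration and proves nothing: it assumes that Python's pseudo-random generator is an adequate stand-in for independent, identically distributed samples, and it uses the exponential distribution with mean $m_0 = 1$ as an example of an integrable but unbounded $f_n$. The helper name `running_average` is hypothetical.

```python
import random

def running_average(sample, n_terms, seed=0):
    """Return (1/N) * sum of N independent draws produced by `sample(rng)`."""
    rng = random.Random(seed)       # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n_terms):
        total += sample(rng)
    return total / n_terms

# Exponential distribution with rate 1, hence mean m_0 = 1.
avg = running_average(lambda rng: rng.expovariate(1.0), 200_000)
```

With 200,000 terms the average should differ from 1 by only a few thousandths.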

2.2 The role of martingales 

We shall now look at sums of independent functions (random variables) 
from a different angle and relate these sums to the notion of martingales. 
The basic definition required is that of the conditional expectation of a function $f$ with respect to a σ-sub-algebra $\mathcal{A}$ of the σ-algebra $\mathcal{M}$ of measurable sets of $X$. In fact, for the sake of brevity of terminology, in what follows we drop the adjective "σ" and use "algebra" and "sub-algebra" to mean σ-algebra and σ-sub-algebra, respectively.

Suppose $\mathcal{A}$ is a given such sub-algebra. We say that a function $F$ on $X$ is measurable with respect to $\mathcal{A}$ (or $\mathcal{A}$-measurable) if $F^{-1}(B) \in \mathcal{A}$ for all Borel subsets $B$ of $\mathbb{R}$. The algebra $\mathcal{A}$ is said to be determined by $F$, sometimes written $\mathcal{A} = \mathcal{A}_F$, if $\mathcal{A}$ is the smallest algebra with respect to which $F$ is measurable; that is, $\mathcal{A}_F = \{F^{-1}(B)\}$ as $B$ ranges over the Borel sets of $\mathbb{R}$.

Given an integrable function $f$ on $X$ and a sub-algebra $\mathcal{A}$, then $\mathbb{E}_{\mathcal{A}}(f)$, also sometimes written as $\mathbb{E}(f|\mathcal{A})$, is the unique function $F$ described by the proposition below. It is called the conditional expectation of $f$ with respect to $\mathcal{A}$.

Proposition 2.4 Given an integrable function $f$ and a sub-algebra $\mathcal{A}$ of $\mathcal{M}$, there is a unique⁹ function $F$ so that:

(i) $F$ is $\mathcal{A}$-measurable.

(ii) $\int_A F\,dm = \int_A f\,dm$ for any set $A \in \mathcal{A}$.

In general, one may think of the conditional expectation as the "best guess" of the function $f$ given the knowledge of $\mathcal{A}$. A simple example to keep in mind is $\mathbb{E}_{\mathcal{A}}(f) = \mathbb{E}_n(f)$ given in Section 1.6 above. In that case, $\mathcal{A}$ is the (finite) algebra generated by the dyadic intervals of length $2^{-n}$ on $[0,1]$.

Proof. We denote by $m'$ the restriction of the measure $m$ to $\mathcal{A}$. Define a (σ-finite) signed measure $\nu$ on $\mathcal{A}$ by $\nu(A) = \int_A f\,dm$, for $A \in \mathcal{A}$. Then since $\nu$ is clearly absolutely continuous with respect to $m'$, the Lebesgue-Radon-Nikodym theorem¹⁰ guarantees that there is a function $F$ that is $\mathcal{A}$-measurable so that $\nu(A) = \int_A F\,dm' = \int_A F\,dm$. Given the definition of $\nu$, the existence of the required $F$ is therefore established. Its uniqueness is clear because if $G$ is $\mathcal{A}$-measurable and $\int_A G\,dm = 0$ for every $A \in \mathcal{A}$, then necessarily $G = 0$.

Once the algebra $\mathcal{A}$ is fixed, we shall not always indicate the dependence of the conditional expectation on the algebra, but write it simply as $\mathbb{E}$ instead of $\mathbb{E}_{\mathcal{A}}$.

There are a number of elementary observations about conditional expectations $\mathbb{E}$ that are direct consequences of the defining proposition for $F = \mathbb{E}(f)$. We leave these for the reader to verify.


9 Uniqueness, of course, means determined up to a set of measure zero. 

10 See for example Theorem 4.3 in Chapter 6 of Book III.

• The mapping $f \mapsto \mathbb{E}(f)$ is linear.

• $\int_X \mathbb{E}(f)\,dm = \int_X f\,dm$, and $\mathbb{E}(1) = 1$.

• $\mathbb{E}(f) \ge 0$ if $f \ge 0$, and $|\mathbb{E}(f_1)| \le \mathbb{E}(f)$ if $|f_1| \le f$.

• $\mathbb{E}^2 = \mathbb{E}$, and in particular $\mathbb{E}(f) = f$ if $f$ is $\mathcal{A}$-measurable.

• $\mathbb{E}(gf) = g\,\mathbb{E}(f)$, if $g$ is bounded and $\mathcal{A}$-measurable.
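
These properties are easy to test in a discrete version of the dyadic example. The sketch below rests on an assumption made only for illustration: $X$ is replaced by $2^4$ points of equal measure (standing in for the dyadic intervals of length $2^{-4}$), and the algebra is generated by blocks of consecutive points. The names `cond_exp` and `integral` are hypothetical helpers, and the conditional expectation is simply the average over each block.

```python
from fractions import Fraction

M = 4                      # X has 2**M points, each of measure 2**-M
POINTS = 2 ** M

def cond_exp(f, n):
    """E_n(f): replace f by its average on each dyadic block of 2**(M - n) points."""
    size = 2 ** (M - n)
    out = []
    for start in range(0, POINTS, size):
        block = f[start:start + size]
        avg = sum(block, Fraction(0)) / size
        out.extend([avg] * size)
    return out

def integral(f):
    """Integral of f over X with respect to the uniform probability measure."""
    return sum(f, Fraction(0)) / POINTS

f = [Fraction(k * k % 7) for k in range(POINTS)]          # an arbitrary test function
g = cond_exp([Fraction(k) for k in range(POINTS)], 2)     # constant on blocks of 4 points

Ef = cond_exp(f, 2)
same_integral = integral(Ef) == integral(f)
idempotent = cond_exp(Ef, 2) == Ef
pulls_out = cond_exp([a * b for a, b in zip(g, f)], 2) == [a * b for a, b in zip(g, Ef)]
```

Here `g` plays the role of a bounded measurable function for the block algebra, so the last check is the identity $\mathbb{E}(gf) = g\,\mathbb{E}(f)$.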

Two other noteworthy properties of E are contained in the following. 

Lemma 2.5 

(a) If $f \in L^2$, then $\mathbb{E}(f) \in L^2$ and $\|\mathbb{E}(f)\|_{L^2} \le \|f\|_{L^2}$.

(b) If $f, g \in L^2$, then $\int_X \mathbb{E}(f)\,g\,dm = \int_X f\,\mathbb{E}(g)\,dm$.

Note. The conclusion (b) of the lemma, together with the property $\mathbb{E}^2 = \mathbb{E}$, shows that $\mathbb{E}$ is an orthogonal projection on the Hilbert space $L^2(X, m)$.

Proof. To establish (a) observe that if $g$ is bounded and $\mathcal{A}$-measurable, then by the proposition above, $\int_X gf\,dm = \int_X \mathbb{E}(gf)\,dm = \int_X g\,\mathbb{E}(f)\,dm$. But

$$\|\mathbb{E}(f)\|_{L^2} = \sup_g \left| \int_X g\,\mathbb{E}(f)\,dm \right|,$$

where $g$ ranges over bounded $\mathcal{A}$-measurable functions with $\|g\|_{L^2} \le 1$ (see Lemma 4.2 in Chapter 1), because of the fact that $\mathbb{E}(f)$ is $\mathcal{A}$-measurable. Moreover $\left| \int_X gf\,dm \right| \le \|f\|_{L^2}$ for such $g$, which gives conclusion (a).

Next observe that $\int_X \mathbb{E}(g) f\,dm = \int_X \mathbb{E}(\mathbb{E}(g) f)\,dm = \int_X \mathbb{E}(g)\,\mathbb{E}(f)\,dm$, whenever $g$ is bounded. By symmetry in $f$ and $g$ this gives conclusion (b) when both $f$ and $g$ are bounded, and by the continuity in (a) the result extends to $f$ and $g$ in $L^2$.

After these preliminaries, we are ready for the task at hand. We now assume we are given an increasing sequence of sub-algebras of $\mathcal{M}$, that is, we have

$$\mathcal{A}_0 \subset \mathcal{A}_1 \subset \cdots \subset \mathcal{A}_n \subset \cdots \subset \mathcal{M},$$

and to each sub-algebra we attach its conditional expectation,

$$\mathbb{E}_n = \mathbb{E}_{\mathcal{A}_n} \quad \text{for } n = 0, 1, 2, \dots.$$

The increasing character of the sequence $\mathcal{A}_n$ implies that expectation operators form an increasing sequence in the sense that

$$\mathbb{E}_n \mathbb{E}_m = \mathbb{E}_{\min(n,m)}, \quad \text{for all } n, m.$$

Indeed, if $m \le n$, then $\mathcal{A}_m \subset \mathcal{A}_n$, so $g = \mathbb{E}_m(f)$ is $\mathcal{A}_n$-measurable, and consequently $\mathbb{E}_n(g) = g$. In the other case, if $n \le m$, and $A \in \mathcal{A}_n$, then

$$\int_A \mathbb{E}_n(f)\,dm = \int_A f\,dm = \int_A \mathbb{E}_m(f)\,dm,$$

where the second equality follows from the fact that $A$ is also $\mathcal{A}_m$-measurable. Therefore the definition of conditional expectation implies that $\mathbb{E}_n(\mathbb{E}_m(f)) = \mathbb{E}_n(f)$.

With this we arrive at the crucial definition. Having fixed our increasing family of algebras $\{\mathcal{A}_n\}$ and the resulting conditional expectations, we say that a sequence $\{s_n\}$ of integrable functions on $X$ forms a martingale sequence if for all $k$ and $n$,

$$(14) \qquad s_k = \mathbb{E}_k(s_n), \quad \text{whenever } k \le n.$$

Note that by this definition, each $s_k$ is automatically $\mathcal{A}_k$-measurable.

If the sequence is finite (and consists of $s_0, s_1, \dots, s_m$) then this is equivalent with $s_k = \mathbb{E}_k(s_m)$, for all $k \le m$. An important class of martingale sequences are those that are complete. This means that there is an integrable function $s_\infty$ so that $s_k = \mathbb{E}_k(s_\infty)$ for all $k$.

The fundamental connection between sums of independent random variables and martingales is contained in the following assertion.

Proposition 2.6 Suppose $\{f_k\}$ is a sequence of integrable functions that are mutually independent and each have mean zero. Then there is an increasing family $\mathcal{A}_n$ of sub-algebras so that with respect to these $s_n = \sum_{k=0}^{n} f_k$ is a martingale sequence.

To see this, we require further terminology. Let $\{\mathcal{B}_n\}$ be a sequence of sub-algebras of $\mathcal{M}$ that are not assumed to be increasing. Then these are said to be mutually independent if for every $N$,

$$m\Big(\bigcap_{j=0}^{N} B_j\Big) = \prod_{j=0}^{N} m(B_j) \quad \text{for all choices } B_j \in \mathcal{B}_j.$$

Notice that if $\mathcal{A}_{f_n}$ are the sub-algebras determined by the $f_n$, then the fact that $\{\mathcal{A}_{f_n}\}$ are mutually independent is equivalent to the functions $\{f_n\}$ being mutually independent, according to the definition given in (3).

Now starting with our independent functions $f_0, f_1, \dots, f_n, \dots$ we define $\mathcal{A}_n$ to be the algebra generated by $\mathcal{A}_{f_0} \cup \mathcal{A}_{f_1} \cup \cdots \cup \mathcal{A}_{f_n}$. It is useful to have the short-hand $\bigvee_{k=0}^{n} \mathcal{B}_k$ to denote the algebra generated by $\mathcal{B}_0 \cup \mathcal{B}_1 \cup \cdots \cup \mathcal{B}_n$. Thus we have set $\mathcal{A}_n = \bigvee_{k=0}^{n} \mathcal{A}_{f_k}$. Our claim is that $\bigvee_{j=0}^{k} \mathcal{A}_{f_j}$ and $\mathcal{A}_{f_n}$ are mutually independent whenever $k < n$. This is an immediate consequence of the following lemma.

Lemma 2.7 Suppose $\mathcal{B}_0, \dots, \mathcal{B}_n$ are mutually independent algebras. Then for each $k < n$, the algebras $\bigvee_{j=0}^{k} \mathcal{B}_j$ and $\mathcal{B}_n$ are mutually independent.


See Exercise 7. 

Now clearly $\{\mathcal{A}_n\}$ is an increasing sequence of algebras and $\mathbb{E}_k(f_\ell) = f_\ell$, if $k \ge \ell$, since each $f_\ell$ is also $\mathcal{A}_k$-measurable. We next observe that $\mathbb{E}_k(f_\ell) = 0$ if $k < \ell$. Indeed, recall first that $F = \mathbb{E}_k(f_\ell)$ is $\mathcal{A}_k$-measurable and

$$\int_{A_k} F\,dm = \int_{A_k} f_\ell\,dm, \quad \text{for every set } A_k \in \mathcal{A}_k.$$

But

$$\int_{A_k} f_\ell\,dm = \int_X \chi_{A_k} f_\ell\,dm = m(A_k) \int_X f_\ell\,dm = 0,$$

by the independence of $\mathcal{A}_k$ and $\mathcal{A}_{f_\ell}$, and the fact that the mean of $f_\ell$ is zero. Hence $F = 0$. Finally for $k \le n$

$$\mathbb{E}_k(s_n) = \mathbb{E}_k(f_0 + f_1 + \cdots + f_k) + \mathbb{E}_k(f_{k+1} + \cdots + f_n) = f_0 + \cdots + f_k = s_k.$$

Thus (14) holds and the proposition is proved.
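
The proposition can be verified by brute force in the simplest case. The sketch below assumes a finite model chosen only for illustration: $X = \{-1, 1\}^4$ with the uniform measure, $f_k(x) = a_k x_k$ for arbitrary coefficients $a_k$, and $\mathcal{A}_k$ the algebra of sets determined by the coordinates $x_0, \dots, x_k$. The names `s`, `cond_exp` and the list `A` are hypothetical.

```python
from fractions import Fraction
from itertools import product

N_STEPS = 4
A = [Fraction(1), Fraction(1, 2), Fraction(3), Fraction(1, 5)]   # arbitrary coefficients
X = list(product((-1, 1), repeat=N_STEPS))                        # all points, equally likely

def s(n, x):
    """Partial sum s_n(x) = f_0(x) + ... + f_n(x) with f_k(x) = A[k] * x[k]."""
    return sum((A[k] * x[k] for k in range(n + 1)), Fraction(0))

def cond_exp(h, k, x):
    """E_k(h)(x): average of h over the points agreeing with x in coordinates 0..k."""
    fibre = [y for y in X if y[:k + 1] == x[:k + 1]]
    return sum((h(y) for y in fibre), Fraction(0)) / len(fibre)

# (14): s_k = E_k(s_n) whenever k <= n
martingale = all(cond_exp(lambda y: s(n, y), k, x) == s(k, x)
                 for n in range(N_STEPS) for k in range(n + 1) for x in X)

# E_k(f_l) = 0 whenever k < l
later_terms_vanish = all(cond_exp(lambda y: A[l] * y[l], k, x) == 0
                         for l in range(N_STEPS) for k in range(l) for x in X)
```

Both checks enumerate all 16 points, so they confirm the identities exactly in this model.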


Having reached this point, we are ready to use the ideas of martingales 
to extend the results of Section 1.6. 

Theorem 2.8 Suppose $f_0, \dots, f_n, \dots$ are independent functions that are square integrable, and that each has mean zero, and variance $\sigma_n^2 = \|f_n\|_{L^2}^2$. Assume that

$$\sum_{n=0}^{\infty} \sigma_n^2 < \infty.$$

Then $s_n = \sum_{k=0}^{n} f_k$ converges (as $n \to \infty$) almost everywhere.

A corollary of this, where we only assume that the $\{\sigma_n\}$ are bounded, gives the strong law of large numbers in this setting.

Corollary 2.9 If $\sup_n \sigma_n < \infty$, then for each $\alpha > 1/2$

$$\frac{s_n}{n^\alpha} \to 0 \quad \text{almost everywhere as } n \to \infty.$$


Note that here, unlike in Theorem 2.1, we have not assumed that the 
f n are identically distributed. On the other hand, we have made a more 
restrictive assumption in requiring square integrability. 

We begin the proof of the theorem by noting that under its assumptions the sequence $s_n = \sum_{k=0}^{n} f_k$ converges in the $L^2$ norm, as $n \to \infty$. Indeed, since the $f_n$ are mutually independent and $\int_X f_n\,dm = 0$, then by (4) they are mutually orthogonal. Hence by Pythagoras' theorem, if $m < n$,

$$\|s_n - s_m\|_{L^2}^2 = \sum_{k=m+1}^{n} \|f_k\|_{L^2}^2 = \sum_{k=m+1}^{n} \sigma_k^2 \to 0, \quad \text{as } n, m \to \infty.$$

Thus $s_n$ converges to a limit (call it $s_\infty$) in the $L^2$ norm. Using (14) and the fact that each $\mathbb{E}_n$ is continuous in the $L^2$ norm by Lemma 2.5, we arrive at

$$s_n = \mathbb{E}_n(s_\infty), \quad \text{for all } n.$$


Our desired result now follows from a basic maximal theorem for martingales and its corollary, which gives convergence almost everywhere.

Theorem 2.10 Suppose $s_\infty$ is an integrable function, and $s_n = \mathbb{E}_n(s_\infty)$, where the $\mathbb{E}_n$ are conditional expectations for an increasing family $\{\mathcal{A}_n\}$ of sub-algebras of $\mathcal{M}$. Then:

(a) $m(\{x : \sup_n |s_n(x)| > \alpha\}) \le \frac{1}{\alpha} \|s_\infty\|_{L^1}$ for every $\alpha > 0$.

(b) If $s_n$ converges in the $L^1$ norm as $n \to \infty$, then it also converges almost everywhere to the same limit.

Note. The assumption in part (b) is in reality redundant because if $s_n = \mathbb{E}_n(s_\infty)$ with $s_\infty \in L^1$, then automatically $s_n$ converges in the $L^1$ norm; but in general this limit need not be $s_\infty$. (See Exercise 27.) However in the situation in which we apply the theorem, we know already that $s_n \to s_\infty$ in the $L^2$ norm, hence also in the $L^1$ norm.

For the proof of part (a) we may assume that $s_\infty$ is non-negative, for otherwise we may proceed with $|s_\infty|$ instead of $s_\infty$ and then obtain the result once we observe that $|\mathbb{E}_n(s_\infty)| \le \mathbb{E}_n(|s_\infty|)$. For fixed $\alpha$, let $A = \{x : \sup_n s_n(x) > \alpha\}$. Then we can partition $A = \bigcup_{n=0}^{\infty} A_n$, where $A_n$ is the set where $n$ is the first time that $s_n(x) > \alpha$. That is, $A_n = \{x : s_n(x) > \alpha, \text{ but } s_k(x) \le \alpha \text{ for } k < n\}$. Note that $A_n \in \mathcal{A}_n$. Also,

$$\int_A s_\infty\,dm = \sum_{n=0}^{\infty} \int_{A_n} s_\infty\,dm = \sum_{n=0}^{\infty} \int_{A_n} \mathbb{E}_n(s_\infty)\,dm = \sum_{n=0}^{\infty} \int_{A_n} s_n\,dm \ge \alpha \sum_{n=0}^{\infty} m(A_n) = \alpha\,m(A).$$

The identity $\int_{A_n} \mathbb{E}_n(s_\infty)\,dm = \int_{A_n} s_\infty\,dm$ follows from the definition of the conditional expectation $\mathbb{E}_n(s_\infty)$. Thus

$$(15) \qquad m(A) \le \frac{1}{\alpha} \int_A s_\infty\,dm, \quad \text{with } A = \{x : \sup_n s_n(x) > \alpha\},$$

and part (a) is proved. (The reader might find it instructive to compare (15) with a corresponding estimate for the Hardy-Littlewood maximal function in equation (28) of Chapter 2.)
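
Part (a) can be tested exactly for a finite martingale. The sketch below assumes, for illustration only, the coin-flip sums $s_n = x_0 + \cdots + x_n$ on $X = \{-1, 1\}^8$ with the uniform measure, with the last partial sum playing the role of $s_\infty$; the names `partial_sums` and `maximal_inequality` are hypothetical.

```python
from fractions import Fraction
from itertools import product

N_STEPS = 8
X = list(product((-1, 1), repeat=N_STEPS))   # 2**8 equally likely sign patterns

def partial_sums(x):
    """s_0(x), ..., s_{N_STEPS-1}(x) for the sums s_n = x_0 + ... + x_n."""
    sums, total = [], 0
    for step in x:
        total += step
        sums.append(total)
    return sums

def maximal_inequality(alpha):
    """Return (m({sup_n |s_n| > alpha}), ||s_last||_{L^1} / alpha) as exact fractions."""
    big = sum(1 for x in X if max(abs(v) for v in partial_sums(x)) > alpha)
    l1_norm = Fraction(sum(abs(partial_sums(x)[-1]) for x in X), len(X))
    return Fraction(big, len(X)), l1_norm / alpha

checks = [maximal_inequality(alpha) for alpha in (1, 2, 3, 5)]
```

Each pair in `checks` consists of the left-hand and right-hand sides of (a); the first entry should never exceed the second.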

To prove (b), assume first that $s_n \to s_\infty$ in the $L^1$ norm. Remark that we always have $s_n - s_\infty = \mathbb{E}_n(s_\infty - s_k) + s_k - s_\infty$ if $n \ge k$, because then $\mathbb{E}_n(s_k) = s_k$. We will show that if $A_\alpha = \{x : \limsup_{n \to \infty} |s_n(x) - s_\infty(x)| > 2\alpha\}$, then $m(A_\alpha) = 0$ for every $\alpha > 0$, and this assures our conclusion about the existence of the limit. Now with $\alpha$ given, let $\epsilon > 0$ be arbitrary. Then choose $k$ so large that $\|s_k - s_\infty\|_{L^1} < \epsilon$. Then

$$\limsup_{n \to \infty} |s_n - s_\infty| \le \sup_{n \ge k} |\mathbb{E}_n(s_\infty - s_k)| + |s_k - s_\infty|.$$

If $A_\alpha^1 = \{x : \sup_n |\mathbb{E}_n(s_\infty - s_k)(x)| > \alpha\}$ and $A_\alpha^2 = \{x : |s_k(x) - s_\infty(x)| > \alpha\}$, then

$$m(A_\alpha) \le m(A_\alpha^1) + m(A_\alpha^2).$$

By part (a) applied to $s_\infty - s_k$ instead of $s_\infty$, we get $m(A_\alpha^1) \le \epsilon/\alpha$. Also Tchebychev's inequality gives $m(A_\alpha^2) \le \epsilon/\alpha$. Altogether then $m(A_\alpha) \le 2\epsilon/\alpha$, and since $\epsilon$ was arbitrary we have $m(A_\alpha) = 0$, which holds for every $\alpha$, proving the result under the additional hypothesis that $s_n$ converges to $s_\infty$ in the $L^1$ norm. Dropping that assumption we can define $s'_\infty$ to be the limit of the sequence $\{s_n\}$ in the $L^1$ norm, which was assumed to exist. Then by (14) and the continuity of $\mathbb{E}_k$ in the $L^1$ norm, we get $s_k = \mathbb{E}_k(s'_\infty)$, and we are back to the previous situation with $s'_\infty$ in place of $s_\infty$. The theorem is therefore completely proved.

The corollary then follows by the same argument used in the proof of Corollary 1.5.

2.3 The zero-one law

The kernel of the idea is the observation that if $\mathcal{A}_1$ and $\mathcal{A}_2$ are two independent algebras, and the set $A$ belongs simultaneously to both $\mathcal{A}_1$ and $\mathcal{A}_2$, then necessarily $m(A) = 0$ or $m(A) = 1$.

Indeed, in this situation, $m(A) = m(A \cap A) = m(A)\,m(A)$ by independence, which proves the assertion. This idea is elaborated in Kolmogorov's zero-one law that we now formulate.

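Suppose $\mathcal{A}_0, \mathcal{A}_1, \dots, \mathcal{A}_n, \dots$ is a sequence of sub-algebras of $\mathcal{M}$, that are not necessarily increasing. With $\bigvee_{k=n}^{\infty} \mathcal{A}_k$ denoting the algebra¹¹ generated by $\mathcal{A}_n, \mathcal{A}_{n+1}, \dots$, we define the tail algebra to be

$$\bigcap_{n=0}^{\infty} \bigvee_{k=n}^{\infty} \mathcal{A}_k.$$

Theorem 2.11 If the algebras $\mathcal{A}_0, \mathcal{A}_1, \dots, \mathcal{A}_n, \dots$ are mutually independent then every element of the tail algebra has either measure zero or one.

Proof. Let $\mathcal{B}$ denote the tail algebra. Note that $\mathcal{A}_r$ is automatically independent from $\bigvee_{k=r+1}^{\infty} \mathcal{A}_k$ by Lemma 2.7. Hence each $\mathcal{A}_r$ is independent of $\mathcal{B}$, so that $\bigvee_{r=0}^{\infty} \mathcal{A}_r$ is independent of $\mathcal{B}$; since $\mathcal{B}$ is contained in $\bigvee_{r=0}^{\infty} \mathcal{A}_r$, the algebras $\mathcal{B}$ and $\mathcal{B}$ are mutually independent! Therefore as observed above, every element of $\mathcal{B}$ has measure zero or one.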


A simple consequence is the following. 

Corollary 2.12 Suppose $f_0, f_1, \dots, f_n, \dots$ are mutually independent functions. The set where $\sum_{k=0}^{\infty} f_k$ converges has measure zero or one.

Proof. Set $\mathcal{A}_n = \mathcal{A}_{f_n}$. Then these algebras are independent. Now with $s_n = \sum_{k=0}^{n} f_k$ and a fixed positive integer $n_0$, we have by the Cauchy criterion that

$$\{x : \lim s_n(x) \text{ exists}\} = \bigcap_{\ell=1}^{\infty} \bigcup_{r=n_0}^{\infty} \{x : |s_n(x) - s_m(x)| < 1/\ell, \text{ all } n, m \ge r\}.$$

Since $\{x : |s_n(x) - s_m(x)| < 1/\ell, \text{ all } n, m \ge r\} \in \bigvee_{k=n_0}^{\infty} \mathcal{A}_k$ whenever $r \ge n_0$, we conclude that the set of convergence is a tail set, as desired.


2.4 The central limit theorem 

We generalize the special case of this theorem given in Section 1.4, connecting its proof in an elegant way with the Fourier transform.

11 Recall that we are using "algebra" as a short-hand for "σ-algebra."

The setting is as follows. On our probability space $(X, m)$ we are given a sequence $f_1, f_2, \dots$ of identically distributed, square integrable, and mutually independent functions (random variables) that each have mean $m_0$ and variance $\sigma^2$.

Theorem 2.13 Let $S_N = \sum_{n=1}^{N} f_n$. Under the above conditions

$$m\left(\left\{x : a < \frac{S_N - N m_0}{N^{1/2}} < b\right\}\right) \to \frac{1}{\sigma\sqrt{2\pi}} \int_a^b e^{-t^2/(2\sigma^2)}\,dt$$

as $N \to \infty$, for each $a < b$.

In proving the theorem we can immediately reduce to the case where the mean $m_0$ is zero, by replacing $f_n$ by $f_n - m_0$ for each $n$. Suppose now that $\mu$ is the common distribution measure of the $f_n$, that $\mu_N$ is the distribution measure of $S_N/N^{1/2}$, and $\nu_{\sigma^2}$ is the distribution measure of the Gaussian with mean zero and variance $\sigma^2$. We consider the Fourier transforms of these measures, called their characteristic functions. In the case of $\mu$ it is given by

$$\hat{\mu}(\xi) = \int_{-\infty}^{\infty} e^{-2\pi i \xi t}\,d\mu(t),$$

with similar formulas¹² for $\hat{\mu}_N$ and $\hat{\nu}_{\sigma^2}$.

Note first that $\hat{\nu}_{\sigma^2}$ can be computed explicitly. It is given by the formula¹³

$$\hat{\nu}_{\sigma^2}(\xi) = e^{-2\sigma^2 \pi^2 \xi^2}.$$

The proof of the theorem can now be presented in three relatively easy steps:

(i) The identity $\hat{\mu}_N(\xi) = \left(\hat{\mu}(\xi/N^{1/2})\right)^N$, for each $N$.

(ii) The fact that $\hat{\mu}_N(\xi) \to \hat{\nu}_{\sigma^2}(\xi)$, for each $\xi$, as $N \to \infty$.

(iii) The resulting consequence that $\mu_N((a, b)) \to \nu_{\sigma^2}((a, b))$, as $N \to \infty$, for all intervals $(a, b)$.


12 To be consistent with our previous usage of the Fourier transform, we have kept the factor $2\pi$ in the exponential, which is not the usual practice in probability theory.

13 See for instance Chapter 5 in Book I.

Now if $\mu$ is the common distribution of the $f_n$, then as we noted in (6), for any $G : \mathbb{R} \to \mathbb{R}$ that is (say) continuous and bounded we have

$$\int_X G(f_n)(x)\,dm = \int_{-\infty}^{\infty} G(t)\,d\mu(t).$$

In particular, taking $G(t) = e^{-2\pi i \xi t}$, with $\xi$ real, we have

$$\hat{\mu}(\xi) = \int_X e^{-2\pi i \xi f_n(x)}\,dm.$$

Similarly $\hat{\mu}_N(\xi) = \int_X e^{-2\pi i \xi S_N(x)/N^{1/2}}\,dm$. However $S_N(x) = \sum_{n=1}^{N} f_n(x)$, thus by the mutual independence of the $f_n$

$$\int_X e^{-2\pi i \xi S_N(x)/N^{1/2}}\,dm = \prod_{n=1}^{N} \int_X e^{-2\pi i \xi f_n(x)/N^{1/2}}\,dm = \left(\hat{\mu}(\xi/N^{1/2})\right)^N.$$

(Note here the similarity with equation (11).) The identity (i) is therefore established.


To carry out the second step we prove the following.

Lemma 2.14 $\hat{\mu}(\xi/N^{1/2}) = 1 - 2\sigma^2 \pi^2 \xi^2/N + o(1/N)$, as $N \to \infty$.

Proof. Indeed, when $\xi$ is fixed

$$e^{-2\pi i \xi t/N^{1/2}} = 1 - 2\pi i \xi t/N^{1/2} - 2\pi^2 \xi^2 t^2/N + E_N(t)$$

with $E_N(t) = O(t^2/N)$, but also $E_N(t) = O(|t|^3/N^{3/2})$. Integrating this in $t$, we get

$$\hat{\mu}(\xi/N^{1/2}) = 1 - \frac{2\pi^2 \sigma^2 \xi^2}{N} + \int_{-\infty}^{\infty} E_N(t)\,d\mu(t),$$

because $m_0 = \int_{-\infty}^{\infty} t\,d\mu(t) = 0$, and $\sigma^2 = \int_{-\infty}^{\infty} t^2\,d\mu(t)$. The lemma will be proved as soon as we see that $\int_{-\infty}^{\infty} E_N(t)\,d\mu(t) = o(1/N)$. However, the integral in question can be divided into a part where $t^2 < \epsilon_N N$ and a complementary part where $t^2 \ge \epsilon_N N$. Here we choose $\epsilon_N$ to tend to zero as $N \to \infty$, while $\epsilon_N N \to \infty$ as $N \to \infty$ (for example, the choice $\epsilon_N = N^{-1/2}$ will do). Now for the first part

$$\int_{t^2 < \epsilon_N N} E_N(t)\,d\mu(t) = O\left(\int_{t^2 < \epsilon_N N} \frac{|t|^3}{N^{3/2}}\,d\mu(t)\right) = O\left(\frac{\epsilon_N^{1/2}}{N} \int t^2\,d\mu(t)\right) = o(1/N).$$

In addition, for the second part we can estimate

$$\int_{t^2 \ge \epsilon_N N} E_N(t)\,d\mu(t) = O\left(\frac{1}{N} \int_{t^2 \ge \epsilon_N N} t^2\,d\mu(t)\right) = o(1/N).$$

Having thus proved the lemma we see that

$$\hat{\mu}_N(\xi) = \hat{\mu}(\xi/N^{1/2})^N = \left(1 - 2\sigma^2 \pi^2 \xi^2/N + o(1/N)\right)^N,$$

and this converges to $e^{-2\sigma^2 \pi^2 \xi^2}$, completing step (ii).
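
Steps (i) and (ii) can be watched numerically in the coin-flip case. The sketch below relies on one assumption specific to this example: the $f_n$ take the values $\pm 1$ with probability $1/2$ each, so that $\sigma^2 = 1$ and $\hat{\mu}(\xi) = \cos(2\pi\xi)$. The function names `mu_hat_N` and `gaussian_hat` and the sample point $\xi = 0.3$ are hypothetical choices.

```python
import math

def mu_hat_N(xi, N):
    """Characteristic function of S_N / sqrt(N) for fair +1/-1 coin flips.

    For a single flip mu_hat(xi) = cos(2 pi xi), so identity (i) gives
    mu_hat_N(xi) = cos(2 pi xi / sqrt(N)) ** N.
    """
    return math.cos(2 * math.pi * xi / math.sqrt(N)) ** N

def gaussian_hat(xi, sigma2=1.0):
    """nu_hat_{sigma^2}(xi) = exp(-2 sigma^2 pi^2 xi^2)."""
    return math.exp(-2 * sigma2 * math.pi ** 2 * xi ** 2)

errors = [abs(mu_hat_N(0.3, N) - gaussian_hat(0.3))
          for N in (100, 10_000, 1_000_000)]
```

The entries of `errors` should decrease roughly like $1/N$, in line with the $o(1/N)$ remainder in Lemma 2.14 being raised to the $N$-th power.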

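To finish the proof of the theorem we need the following lemma. We say that a measure is continuous if each point has measure zero.

Lemma 2.15 Suppose $\{\mu_N\}$, $N = 1, 2, \dots$, and $\nu$ are non-negative finite Borel measures on $\mathbb{R}$, and that $\nu$ is continuous. Assume that $\hat{\mu}_N(\xi) \to \hat{\nu}(\xi)$, as $N \to \infty$, for each $\xi \in \mathbb{R}$. Then $\mu_N((a, b)) \to \nu((a, b))$ for all $a < b$.

Proof. We prove first that

$$(16) \qquad \mu_N(\varphi) \to \nu(\varphi) \quad \text{as } N \to \infty$$

for any $\varphi$ that is $C^\infty$ and has compact support, where we have used the notation $\mu_N(\varphi) = \int_{-\infty}^{\infty} \varphi(t)\,d\mu_N(t)$ and $\nu(\varphi) = \int_{-\infty}^{\infty} \varphi(t)\,d\nu(t)$.

Notice that since $\hat{\mu}_N(0) = \int_{-\infty}^{\infty} d\mu_N(t) = \mu_N(\mathbb{R})$, then the convergent sequence $\mu_N(\mathbb{R})$ must be bounded. As a result, for some $M$ we have $|\hat{\mu}_N(\xi)| \le M$ for all $N$ and also $|\hat{\nu}(\xi)| \le M$.

Next, the function $\varphi$ can be represented by its inverse Fourier transform $\varphi(t) = \int_{-\infty}^{\infty} e^{-2\pi i t \xi} \varphi^\vee(\xi)\,d\xi$, where $\varphi^\vee(\xi) = \hat{\varphi}(-\xi)$ is necessarily in the Schwartz space $\mathcal{S}$. This shows that

$$\int_{-\infty}^{\infty} \varphi\,d\mu_N = \int_{-\infty}^{\infty} \hat{\mu}_N(\xi)\,\varphi^\vee(\xi)\,d\xi$$

by applying Fubini's theorem to $\iint e^{-2\pi i t \xi} \varphi^\vee(\xi)\,d\mu_N(t)\,d\xi$, which is justified by the rapid decrease of $\varphi^\vee$. Similarly, $\int \varphi\,d\nu = \int \hat{\nu}(\xi)\,\varphi^\vee(\xi)\,d\xi$. Then since $\hat{\mu}_N(\xi) \to \hat{\nu}(\xi)$ pointwise and boundedly we obtain (16).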

Now for $(a, b)$ fixed, let $\varphi_\epsilon$ be a sequence of positive $C^\infty$ functions with $\varphi_\epsilon \le \chi_{(a,b)}$, and $\varphi_\epsilon(t) \to \chi_{(a,b)}(t)$ for every $t$ as $\epsilon \to 0$. Then

$$\mu_N((a, b)) \ge \mu_N(\varphi_\epsilon) \to \nu(\varphi_\epsilon) \quad \text{as } N \to \infty.$$

Figure 3. The functions $\varphi_\epsilon$ and $\psi_\epsilon$ in Lemma 2.15

As a result, $\liminf_{N \to \infty} \mu_N((a, b)) \ge \nu(\varphi_\epsilon)$, and letting $\epsilon \to 0$ gives

$$\liminf_{N \to \infty} \mu_N((a, b)) \ge \nu((a, b)).$$

Similarly, let $\psi_\epsilon$ be a sequence of $C^\infty$ functions so that $\psi_\epsilon \ge \chi_{[a,b]}$ and $\psi_\epsilon(t) \to \chi_{[a,b]}(t)$ for every $t$ as $\epsilon \to 0$. Then by the same reasoning, $\limsup_{N \to \infty} \mu_N((a, b)) \le \nu([a, b]) = \nu((a, b))$, by the continuity of $\nu$.

Thus the lemma is proved, and with it the theorem is established, once we take $\nu = \nu_{\sigma^2}$.

Another way to put the conclusion of the theorem is in terms of weak convergence of measures. We say that a sequence of probability measures $\{\mu_N\}$ converges weakly to the probability measure $\nu$, if (16) holds for all continuous functions $\varphi$ that are bounded on $\mathbb{R}$.

Corollary 2.16 If $\mu_N$ is the distribution measure of $(S_N - N m_0)/N^{1/2}$, then $\mu_N$ converges weakly to $\nu_{\sigma^2}$.

We note first that (16) holds for any function $\varphi$ that is continuous and has compact support. Indeed such a $\varphi$ can be uniformly approximated by a sequence $\{\varphi_\epsilon\}$ of $C^\infty$ functions of compact support.¹⁴ Now

$$\mu_N(\varphi) - \nu(\varphi) = \mu_N(\varphi - \varphi_\epsilon) - \nu(\varphi - \varphi_\epsilon) + (\mu_N - \nu)(\varphi_\epsilon).$$

Now the sum of the first two terms on the right-hand side is majorized by $2 \sup_t |\varphi(t) - \varphi_\epsilon(t)|$, and this can be made small by choosing $\epsilon$ conveniently. Once $\epsilon$ is chosen we need only let $N \to \infty$ and apply (16) for $\varphi_\epsilon$.


14 See for example the proof of Lemma 4.10 in Chapter 5 of Book III. 






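To pass to $\varphi$ whose support is not compact, we note that

$$(17) \qquad \limsup_{N \to \infty} \mu_N(\chi_{(I_R)^c}) \le \epsilon(R),$$

where $\epsilon(R) \to 0$ as $R \to \infty$, and $I_R$ is the interval $|t| \le R$. In fact if $\eta_R$ is continuous with $0 \le \eta_R \le \chi_{I_R}$, $\eta_R(t) = 1$ for $|t| \le R/2$, then $\mu_N(\chi_{I_R}) \ge \mu_N(\eta_R) \to \nu(\eta_R)$, as $N \to \infty$. Hence $\liminf_{N \to \infty} \mu_N(\chi_{I_R}) \ge 1 - \nu(1 - \eta_R)$, but $\nu(1 - \eta_R) = \epsilon(R) \to 0$, as $R \to \infty$, so (17) holds.

Now suppose $\varphi$ is a given continuous and bounded function on $\mathbb{R}$. We can assume that $0 \le \varphi \le 1$. For each $R$, let $\varphi_R$ be a continuous function with $\varphi_R(t) = \varphi(t)$ for $|t| \le R$, but $\varphi_R(t) = 0$ for $|t| \ge 2R$, while $0 \le \varphi_R(t) \le \varphi(t)$ everywhere.

Then $\varphi \le \varphi_R + \chi_{(I_R)^c}$, so

$$\mu_N(\varphi) \le \mu_N(\varphi_R) + \mu_N(\chi_{(I_R)^c}).$$

Therefore $\limsup_{N \to \infty} \mu_N(\varphi) \le \nu(\varphi) + \epsilon(R)$, and letting $R \to \infty$ gives $\limsup_{N \to \infty} \mu_N(\varphi) \le \nu(\varphi)$. However

$$\liminf_{N \to \infty} \mu_N(\varphi) \ge \lim_{N \to \infty} \mu_N(\varphi_R) = \nu(\varphi_R) \to \nu(\varphi) \quad \text{as } R \to \infty.$$

Thus $\lim_{N \to \infty} \mu_N(\varphi) = \nu(\varphi)$, proving the corollary.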

2.5 Random variables with values in $\mathbb{R}^d$

Up to this point, with the exception of Section 1.7, our functions have been assumed to be real-valued. However, for many purposes it is useful to extend the theory to the setting where the functions take their values in $\mathbb{R}^d$ (and in particular, to complex-valued functions, which corresponds to the case $d = 2$). Often this extension is rather routine. In what follows, we will limit ourselves to a formulation of the $d$-dimensional version of the central limit theorem. First, some notation.

Suppose $f$ is an $\mathbb{R}^d$-valued function on $(X, m)$. We write it in coordinates as $f = (f^{(1)}, f^{(2)}, \dots, f^{(d)})$, where each $f^{(k)}$ is real-valued. The distribution measure of $f$ is the non-negative Borel measure $\mu$ on $\mathbb{R}^d$ defined by

$$\mu(B) = m(f^{-1}(B)) = m(\{x : f(x) \in B\}), \quad \text{for each Borel set } B \subset \mathbb{R}^d.$$

Of course $\mu(\mathbb{R}^d) = 1$, so $\mu$ is a probability measure.

The function $f$ is said to be integrable if $|f| = \left(\sum_{k=1}^{d} |f^{(k)}|^2\right)^{1/2}$ is integrable. Square integrability of $f$ is defined similarly. When $f$ is integrable then its mean (or expectation) is defined as the vector

$$m_0 = (m_0^{(k)}), \quad \text{with } m_0^{(k)} = \int_X f^{(k)}(x)\,dm.$$

If $f$ is square integrable, the covariance matrix of $f$ is the $d \times d$ matrix $\{a_{ij}\}$ with

$$a_{ij} = \int_X (f^{(i)}(x) - m_0^{(i)})(f^{(j)}(x) - m_0^{(j)})\,dm.$$

Note that $a_{ij} = \int_{\mathbb{R}^d} (t_i - m_0^{(i)})(t_j - m_0^{(j)})\,d\mu(t)$ and this matrix is symmetric and non-negative. It has a (unique) square root $\sigma$ which is symmetric and non-negative, and thus we write $\sigma^2$ for the covariance matrix of $f$.

Next, we say that the sequence of $\mathbb{R}^d$-valued functions $f_1, \dots, f_n, \dots$ are mutually independent if the algebras

$$\mathcal{A}_n = \mathcal{A}_{f_n} = \{f_n^{-1}(B), \text{ all Borel sets } B \text{ in } \mathbb{R}^d\}$$

are mutually independent. Notice that this implies that for each vector $\xi = (\xi_1, \dots, \xi_d) \in \mathbb{R}^d$, the scalar-valued functions $\xi \cdot f_1, \dots, \xi \cdot f_n, \dots$, where $\xi \cdot f_n = \xi_1 f_n^{(1)} + \xi_2 f_n^{(2)} + \cdots + \xi_d f_n^{(d)}$, are mutually independent.

Two other preliminary matters. Given an $\mathbb{R}^d$-valued random variable (function) $f$, its characteristic function is the $d$-dimensional Fourier transform $\hat{\mu}(\xi) = \int_{\mathbb{R}^d} e^{-2\pi i \xi \cdot t}\,d\mu(t)$, $\xi \in \mathbb{R}^d$, where $\mu$ is the distribution measure of $f$. Of course $\hat{\mu}(\xi) = \int_X e^{-2\pi i \xi \cdot f(x)}\,dm$.

Also adapting a previous terminology, if $\{\mu_N\}$ is a sequence of probability measures on $\mathbb{R}^d$, and $\nu$ is another probability measure on $\mathbb{R}^d$, then we say that $\mu_N \to \nu$ weakly if

$$\int_{\mathbb{R}^d} \varphi\,d\mu_N \to \int_{\mathbb{R}^d} \varphi\,d\nu \quad \text{as } N \to \infty,$$

for all continuous and bounded functions $\varphi$ on $\mathbb{R}^d$.

We now come to the theorem. We suppose that our sequence $\{f_n\}$ of $\mathbb{R}^d$-valued functions are mutually independent, that they are identically distributed and are square integrable with mean zero. If $\sigma^2$ denotes the common covariance matrix, we assume that $\sigma$ is invertible, and write $\sigma^{-1}$ for its inverse.

Let $\mu_N$ be the distribution measure of $\frac{1}{N^{1/2}} \sum_{n=1}^{N} f_n$, and $\nu_{\sigma^2}$ be the measure on $\mathbb{R}^d$ given by

$$\nu_{\sigma^2}(B) = \frac{1}{(2\pi)^{d/2} (\det \sigma)} \int_B e^{-|\sigma^{-1}(x)|^2/2}\,dx$$

for all Borel sets $B \subset \mathbb{R}^d$.

Theorem 2.17 Under the above conditions on $\{f_n\}$, the measures $\mu_N$ converge weakly to $\nu_{\sigma^2}$ as $N \to \infty$.

The proof proceeds essentially as in the case of real-valued functions, showing first the analog of (16) for smooth functions with compact support, and then proceeding as in Corollary 2.16 for continuous functions. The calculation of the characteristic function of the Gaussian is given in Exercise 32.

Remark. The following generalization can be deduced by a slight modification of the proof of Theorem 2.17. Suppose $\{f_n\}$ satisfies the conditions of the theorem, $t > 0$, and define

$$S_{N,t} = \frac{1}{N^{1/2}} \sum_{n=1}^{[Nt]} f_n.$$

(Here $[x]$ denotes the integer part of $x$.) Then, the distribution measure of $S_{N,t}$ converges weakly to $\nu_{t\sigma^2}$ as $N \to \infty$. In fact, if $0 \le s < t$, then the distribution measure of $S_{N,t} - S_{N,s}$ converges weakly to $\nu_{(t-s)\sigma^2}$ as $N \to \infty$.


2.6 Random walks 

The coin tossing (or sums of Rademacher functions) considered in Section 1.1
can be thought of as representing a random walk on the real
line. This walk can be described as follows.

One starts at the origin, then moves along a straight line with steps of
unit length; each step taken has equal probability of going to the right or
left, with different steps having independent probabilities. The position
after the $n^{\text{th}}$ step is given by $s_n = \sum_{k=1}^{n} r_k$. Notice that the values of $s_n$
are always integers.

In $\mathbb{R}^d$ we will consider a particular version of a random walk, giving
the simplest generalization of the above. It starts at the origin, and the
position at the $n^{\text{th}}$ step is obtained from the previous step by moving
a unit length in the direction of one of the coordinate axes, each of the $2d$
possible moves having equal probability (that is, probability $1/(2d)$). The
passage at each step will be assumed to be independent of the previous steps. We
formalize this situation as follows.

Let $\mathbb{Z}_{2d}$ be the set of $2d$ points in $\mathbb{R}^d$ labeled by $\{\pm e_1, \pm e_2, \dots, \pm e_d\}$,
where $e_j = (0, \dots, 0, 1, 0, \dots, 0)$ with $1$ in the $j^{\text{th}}$ coordinate and $0$ elsewhere.
Assign to $\mathbb{Z}_{2d}$ the measure that gives weight $1/(2d)$ to each of its
points. Let $X = \mathbb{Z}_{2d}^{\infty}$ be the infinite product of copies of $\mathbb{Z}_{2d}$ endowed
with the product measure, and call this measure $m$. Thus $X$ consists
of points $x = (x_n)_{n=1}^{\infty}$, where each $x_n \in \mathbb{Z}_{2d}$. Now define $r_n(x) = x_n$ for
each $n$. So $r_n(x)$ is one of the $\pm e_j$, for each $n$, and therefore in fact
takes its values in the lattice $\mathbb{Z}^d$ of $\mathbb{R}^d$. Also $\{r_n\}$ are mutually independent
functions, since $r_n(x)$ depends only on the $n^{\text{th}}$ coordinate of $x$.
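Note finally that each $r_n$ has mean zero and covariance matrix $\frac{1}{d} I$ (the
identity when $d = 1$): each coordinate of $r_n$ equals $\pm 1$ with probability
$1/(2d)$ each and $0$ otherwise, and two distinct coordinates are never
simultaneously non-zero.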

The sums

$$s_n(x) = \sum_{k=1}^{n} r_k(x)$$

represent our random walk, in that $x$ labels a possible path and $s_n(x)$
gives the position of this path at the $n^{\text{th}}$ step. It is convenient to set
$s_0(x) = 0$ for all $x$.

Figure 4. The random walk $s_n$ in dimension two


Here we examine only one of the interesting properties exhibited by
this random walk. It illustrates a significant dichotomy between the case
of dimension $d \le 2$ and $d \ge 3$.

Theorem 2.18 For the above random walk: 

(a) If $d = 1$ or $2$, the random walk is recurrent in the sense that almost
all paths return to the origin for infinitely many $n$.

(b) If $d \ge 3$ then almost every path returns to the origin at most a
finite number of times. Moreover, there is a positive probability
that the path never returns to the origin.


In fact, when $d = 1$ or $2$, the random walk visits almost surely every point
of $\mathbb{Z}^d$ infinitely often. However, when $d \ge 3$, one has $\lim_{n \to \infty} |s_n| = \infty$
almost surely. The proofs of these further conclusions are outlined in
Exercises 34 and 35.
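
The dichotomy in Theorem 2.18 can be explored numerically. The following is a minimal Python sketch and not part of the argument in the text: the function names `count_returns` and `fraction_returning`, the step counts, and the seed are all illustrative assumptions. A finite simulation can only suggest the asymptotic statement, since each path is observed for finitely many steps.

```python
import random


def count_returns(d, n_steps, rng):
    """Simulate one path of the walk on Z^d and count its returns to the origin."""
    pos = [0] * d
    returns = 0
    for _ in range(n_steps):
        axis = rng.randrange(d)            # pick a coordinate axis uniformly
        pos[axis] += rng.choice((-1, 1))   # move by +e_j or -e_j, each with probability 1/(2d)
        if not any(pos):                   # all coordinates are zero: back at the origin
            returns += 1
    return returns


def fraction_returning(d, n_steps, n_paths, seed=0):
    """Fraction of simulated paths that return to the origin within n_steps steps."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_paths) if count_returns(d, n_steps, rng) > 0)
    return hits / n_paths
```

For instance, comparing `fraction_returning(1, 1000, 200)` with `fraction_returning(3, 1000, 200)` should show most one-dimensional paths returning within the horizon and a clearly smaller fraction doing so in dimension three.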

Proof. Let $\mu$ be the common distribution measure of each of the $r_n$.
Then $\mu$ is the measure on $\mathbb{R}^d$ concentrated at the points $\pm e_1, \pm e_2, \dots, \pm e_d$,
assigning measure $1/(2d)$ to each of these points. Let $\mu_n$ be the
distribution measure of $s_n$. Like $\mu$, the measure $\mu_n$ is clearly supported
on $\mathbb{Z}^d$.

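If

$$\hat{\mu}(\xi) = \sum_{k \in \mathbb{Z}^d} m(\{x : r_n(x) = k\}) \, e^{-2\pi i k \cdot \xi}$$

is the characteristic function of $\mu$, and

$$\hat{\mu}_n(\xi) = \sum_{k \in \mathbb{Z}^d} m(\{x : s_n(x) = k\}) \, e^{-2\pi i k \cdot \xi}$$

that of $\mu_n$, then $\hat{\mu}_n(\xi) = (\hat{\mu}(\xi))^n$ by the independence argument used
previously several times. (See for instance (11).) Moreover, as is easily
seen

$$\hat{\mu}(\xi) = \frac{1}{d} (\cos 2\pi \xi_1 + \cdots + \cos 2\pi \xi_d).$$

However $\hat{\mu}_n(\xi)$, like $\hat{\mu}(\xi)$, is periodic with periods $e_1, e_2, \dots, e_d$, and thus
for each $n$

(18)  $m(\{x : s_n(x) = 0\}) = \int_Q \hat{\mu}_n(\xi) \, d\xi = \int_Q (\hat{\mu}(\xi))^n \, d\xi$,

where $Q$ is the fundamental cube defined by $Q = \{\xi : -1/2 < \xi_j \le 1/2,\ j = 1, \dots, d\}$.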
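
Identity (18) can be checked directly in dimension one, where $m(\{x : s_n(x) = 0\})$ is a binomial probability and $\hat{\mu}(\xi) = \cos 2\pi\xi$. The sketch below is an illustration under that assumption ($d = 1$); the function names and the number of quadrature points are choices made here, not taken from the text.

```python
from math import comb, cos, pi


def return_prob_exact(n):
    """m({s_n = 0}) for d = 1: n must be even, with exactly n/2 steps to the right."""
    return comb(n, n // 2) / 2**n if n % 2 == 0 else 0.0


def return_prob_fourier(n, points=256):
    """Midpoint-rule value of the integral of cos(2*pi*xi)**n over [-1/2, 1/2]."""
    h = 1.0 / points
    return h * sum(cos(2 * pi * (-0.5 + (j + 0.5) * h)) ** n for j in range(points))
```

On an equally spaced grid the midpoint rule integrates trigonometric polynomials of degree below the number of points exactly, so for $n < 256$ the two functions should agree up to rounding error.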

As a result of all of this we assert that,

(19)  $\sum_{n=0}^{\infty} m(\{x : s_n(x) = 0\}) = \int_Q \frac{d\xi}{1 - \hat{\mu}(\xi)}$.

Note first that $\hat{\mu}(\xi) \le 1$, so the integrand on the right-hand side is always
non-negative (or $+\infty$). The claim is that both sides are simultaneously
infinite, or finite and equal. In fact multiplying both sides of (18) by $r^n$,
for $0 < r < 1$, and summing gives

$$\sum_{n=0}^{\infty} r^n m(\{x : s_n(x) = 0\}) = \int_Q \frac{d\xi}{1 - r \hat{\mu}(\xi)},$$

and letting $r \to 1$ then yields (19).
Now since

$$1 - \hat{\mu}(\xi) = 1 - \frac{1}{d} (\cos 2\pi \xi_1 + \cdots + \cos 2\pi \xi_d) = \frac{2\pi^2}{d} |\xi|^2 + O(|\xi|^4) \quad \text{as } \xi \to 0,$$

and $1 - \hat{\mu}(\xi) \ge C_1$ if $C_2 \le |\xi|$, $\xi \in Q$, for suitable positive constants $C_1$ and
$C_2$, we can conclude that the integral

$$\int_Q \frac{d\xi}{1 - \hat{\mu}(\xi)}$$

diverges when $d = 1$ or $d = 2$, but converges when $d \ge 3$. This means that
$\sum_{n=0}^{\infty} m(\{x : s_n(x) = 0\})$ diverges or converges depending on whether
$d \le 2$ or $d \ge 3$.
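
The threshold between $d = 2$ and $d = 3$ comes from a comparison with $|\xi|^{-2}$ near the origin, since away from the origin the integrand is bounded. A sketch of the computation, with $\delta > 0$ small and $\omega_{d-1}$ denoting the surface area of the unit sphere in $\mathbb{R}^d$ (notation introduced here, not in the text; for $d = 1$ read $\omega_0 = 2$):

```latex
% Polar coordinates with \rho = |\xi|:
\int_{|\xi| \le \delta} \frac{d\xi}{|\xi|^{2}}
  = \omega_{d-1} \int_0^{\delta} \rho^{-2}\, \rho^{d-1}\, d\rho
  = \omega_{d-1} \int_0^{\delta} \rho^{d-3}\, d\rho .
% The last integral is finite exactly when d - 3 > -1, that is, when d \ge 3.
% For d = 1 the integrand is \rho^{-2} and for d = 2 it is \rho^{-1}; both diverge at 0.
```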

The above has the following interpretation. Let $A_n = \{x : s_n(x) = 0\}$,
and $\chi_{A_n}$ its characteristic function. Then $\#(x) = \sum_{n=0}^{\infty} \chi_{A_n}(x)$ is the
number of times the path $x$ visits the origin. Thus $\int_X \#(x) \, dm$ is the
expected number of times all paths visit the origin. However

$$\int_X \#(x) \, dm = \sum_{n=0}^{\infty} m(\{x : s_n(x) = 0\}) = \sum_{n=0}^{\infty} m(A_n),$$

so if $d \ge 3$ this expectation is finite, and therefore almost all paths return
to the origin only a finite number of times, proving the first part of
conclusion (b) of the theorem.

While the expectation is infinite when $d \le 2$, this, by itself, does not
show that almost all paths return infinitely often to the origin. That
we will now see. To proceed we define $F_k$ to be the set of paths where
$s_k(x) = 0$ for the first time

$$F_k = \{x : s_k(x) = 0, \text{ but } s_\ell(x) \ne 0 \text{ for } 1 \le \ell < k\}.$$

(Here we set $F_1 = \emptyset$.) Since the $F_k$ are disjoint, $\sum_{k=1}^{\infty} m(F_k) \le 1$. We
shall see that for $d = 1$ or $2$ in fact $\sum_{k=1}^{\infty} m(F_k) = 1$, which means that
almost all paths return to the origin at least once. This is in contrast
with $d \ge 3$ where $\sum_{k=1}^{\infty} m(F_k) < 1$, which means that for a set of positive
probability, the paths never return to the origin.

We prove these assertions by showing first

(20)  $m(A_n) = \sum_{1 \le k \le n} m(F_k) \, m(A_{n-k})$, for all $n \ge 1$.

In fact, $A_n = \bigcup_{1 \le k \le n} (F_k \cap A_n)$, where this union is disjoint. Therefore
$m(A_n) = \sum_{1 \le k \le n} m(F_k \cap A_n)$. However

$$F_k \cap A_n = F_k \cap \{x : s_n(x) - s_k(x) = 0\}.$$

Hence $m(F_k \cap A_n) = m(F_k) \, m(\{x : s_n(x) - s_k(x) = 0\})$ since the sets $F_k$
and $\{x : s_n(x) - s_k(x) = 0\} = \{x : \sum_{j=k+1}^{n} r_j(x) = 0\}$ are clearly independent.
However

$$m(\{x : s_n(x) - s_k(x) = 0\}) = m(\{x : s_{n-k}(x) = 0\}) = m(A_{n-k}),$$

by the shift-invariance of the measure on the product space $\mathbb{Z}_{2d}^{\infty}$. (We
have already observed this kind of invariance in Section 2.1 in a different
setting.) Thus $m(F_k \cap A_n) = m(F_k) \, m(A_{n-k})$, giving us (20).
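
Relation (20) determines the first-return probabilities $m(F_n)$ recursively from the $m(A_n)$, because $m(A_0) = 1$. As an illustration in dimension one, where $m(A_n) = \binom{n}{n/2} 2^{-n}$ for even $n$ and $0$ for odd $n$, the following Python sketch solves (20) with exact rational arithmetic. The function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from math import comb


def prob_at_origin(n):
    """m(A_n) for d = 1: the probability that the walk is at 0 at time n."""
    return Fraction(comb(n, n // 2), 2**n) if n % 2 == 0 else Fraction(0)


def first_return_probs(n_max):
    """Solve (20) for m(F_1), ..., m(F_{n_max}); index 0 of the result is unused padding."""
    a = [prob_at_origin(n) for n in range(n_max + 1)]
    f = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        # (20) reads a[n] = sum_{k=1}^{n} f[k] * a[n-k]; the k = n term is f[n] * a[0] = f[n].
        f[n] = a[n] - sum(f[k] * a[n - k] for k in range(1, n))
    return f
```

The partial sums of the returned values stay below 1 and increase with `n_max`, in line with the assertion that $\sum_k m(F_k) = 1$ when $d = 1$.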

If we set $A(r) = \sum_{n=0}^{\infty} r^n m(A_n)$, $F(r) = \sum_{n=1}^{\infty} r^n m(F_n)$, $0 < r < 1$,
then (20) can be interpreted to say $A(r) = A(r)F(r) + 1$, that is
$F(r) = 1 - 1/A(r)$. First, when $d \le 2$, since the series $\sum_{n=0}^{\infty} m(A_n)$ diverges,
then $A(r) \to \infty$ as $r \to 1$, which gives $F(1) = \sum_{n=1}^{\infty} m(F_n) = 1$, and
proves that almost all paths return to the origin at least once. Secondly,
when $d \ge 3$, since the series $\sum_{n=0}^{\infty} m(A_n)$ converges, we deduce
that $F(1) = \sum_{n=1}^{\infty} m(F_n) < 1$, hence there is a set of positive probability
where paths never return to the origin.

For the case $d \le 2$, to prove infinite recurrence we define for each $\ell \ge 1$

$$F_n^{(\ell)} = \{x : s_n(x) = 0, \text{ but } s_k(x) = 0,\ \ell - 1 \text{ times, for } 1 \le k < n\}.$$

(Here we set $F_1^{(\ell)} = \emptyset$.) Note that $F_n^{(1)} = F_n$, and $\sum_{n=1}^{\infty} m(F_n^{(\ell)}) = 1$
means that almost every path returns to the origin at least $\ell$ times.
Then by an argument very similar to that giving (20) we get when $\ell \ge 2$

$$m(F_n^{(\ell)}) = \sum_{1 \le k < n} m(F_k^{(\ell-1)}) \, m(F_{n-k}).$$

So if $F^{(\ell)}(r)$ is defined by $F^{(\ell)}(r) = \sum_{n=1}^{\infty} r^n m(F_n^{(\ell)})$, then

$$F^{(\ell)}(r) = F^{(\ell-1)}(r) \, F^{(1)}(r),$$

which by iteration yields $F^{(\ell)}(r) = (F^{(1)}(r))^{\ell}$. Letting $r \to 1$ then gives
$\sum_{n=1}^{\infty} m(F_n^{(\ell)}) = 1$, so almost all paths return to the origin at least $\ell$
times. Since this holds for all $\ell \ge 1$, conclusion (a) of the theorem is also
proved.

It is interesting to ask what happens to our random walk, when the
time interval between successive steps is taken to be $1/n$, the paths are
re-scaled by a factor $1/n^{1/2}$, and we then pass to the limit $n \to \infty$ in
accordance with the central limit theorem. The answer is that in this
way we are led to Brownian motion. This important topic will be our
next subject.


3 Exercises 


1. Consider $S_N(x) = \sum_{n=1}^{N} r_n(x)$, with $N$ odd.

(a) Calculate $m(\{x : S_N(x) = k\})$, and show that as $k$ varies over the integers,
the maximum is attained at $k = -1$ and $k = 1$.

(b) Adapt the proof when $N$ is even to show that for odd $N$,

$$m\left(\left\{x : a < \frac{S_N(x)}{N^{1/2}} < b\right\}\right) \to \frac{1}{\sqrt{2\pi}} \int_a^b e^{-t^2/2} \, dt$$

as $N \to \infty$.


2. Find three functions $f_1$, $f_2$, and $f_3$, so that any pair are mutually independent,
but the three are not.

[Hint: Let $f_1 = r_1$, $f_2 = r_2$, and express $f_3$ in terms of $r_1$ and $r_2$.]

3. The collection $\{r_n\}$ of mutually independent functions on $[0,1]$ cannot be much
enlarged and still remain mutually independent. In fact, prove that if we adjoin
a function $f$ to the collection $\{r_n\}$, then the resulting collection is also mutually
independent only when $f$ is constant.

[Hint: See also Exercise 16.] 

4. Suppose $\mu$ and $\nu$ are two finite measures on a space $X$ that agree on a collection
of sets $\mathcal{C}$. If $\mathcal{C}$ contains $X$ and is closed under finite intersections, then show that
$\mu = \nu$ on the $\sigma$-algebra generated by $\mathcal{C}$.

[Hint: The equality $\mu = \nu$ holds on finite unions of sets in $\mathcal{C}$ because

$$\mu\Big(\bigcup_{j=1}^{k} C_j\Big) = \sum_{j=1}^{k} \mu(C_j) - \sum_{i<j} \mu(C_i \cap C_j) + \sum_{i<j<\ell} \mu(C_i \cap C_j \cap C_\ell) + \cdots + (-1)^{k-1} \mu\Big(\bigcap_{j=1}^{k} C_j\Big).]$$

5. Prove that the $\mathbb{R}^d$-valued functions $f_1, \dots, f_n$ are mutually independent if
and only if their joint distribution measure equals the product of the individual
distribution measures:

$$\mu_{f_1, \dots, f_n} = \mu_{f_1} \times \cdots \times \mu_{f_n} \quad \text{as measures on } \mathbb{R}^{nd} = \mathbb{R}^d \times \cdots \times \mathbb{R}^d.$$

[Hint: Check the equality on cylinder sets in $\mathbb{R}^{nd}$ and use the previous exercise.]


6. Suppose $\{f_n\}$ is a sequence of mutually independent functions on the probability
space $(X, m)$. Prove that there exists a probability space $(X', m')$, with $X'$ an
infinite product, $X' = \prod_{n=1}^{\infty} X_n$, with $(X_n, m_n)$ probability spaces, and $m'$ the product
measure of the $m_n$, so that the following holds: there are functions $\{g_n\}$ on $X'$
so that $\{f_n\}$ and $\{g_n\}$ have the same joint distributions, but each function $g_n$
depends only on the $n^{\text{th}}$ coordinate of $X'$.

[Hint: Take $(X_n, m_n) = (X, m)$ for each $n$ and define $g_n$ in terms of the $f_n$
accordingly, and use the previous exercise.]

7. Show that if $\mathcal{A}_0, \dots, \mathcal{A}_n$ are mutually independent algebras, then for each $k < n$,
the algebras $\bigvee_{j=0}^{k} \mathcal{A}_j$ and $\mathcal{A}_n$ are mutually independent.

Prove this by noting the following. First, use induction to show that if $B_j \in \mathcal{A}_j$,
then $B_0 \cap \cdots \cap B_k$ is independent of $\mathcal{A}_n$. Now fix $B \in \mathcal{A}_n$ and consider the two
finite measures $\mu(E) = m(E \cap B)$ and $\nu(E) = m(E) m(B)$, and the collection $\mathcal{C}$ of
sets that are of the form $E = B_0 \cap \cdots \cap B_k$, where $B_j \in \mathcal{A}_j$. Then apply Exercise 4.

8. Verify the following further facts about probability distribution measures.

(a) Suppose $f = (f_1, \dots, f_k)$ with each $f_j$ an $\mathbb{R}^d$-valued function. Let $\mu$ be the
probability distribution measure of $f$, and let $L$ be a linear transformation
of $\mathbb{R}^{dk}$ to itself. Then the distribution measure of $L(f)$ is $\mu_L$, where
$\mu_L(A) = \mu(L^{-1} A)$ for every Borel set $A \subset \mathbb{R}^{dk}$.

(b) Suppose the distribution measure of $f_j$ is Gaussian with covariance matrix
$\sigma_j^2 I$, $1 \le j \le k$. Assume also that the $\{f_j\}$ are mutually independent. Then
the distribution measure of $c_1 f_1 + \cdots + c_k f_k$ is Gaussian with covariance
matrix $(c_1^2 \sigma_1^2 + \cdots + c_k^2 \sigma_k^2) I$.

[Hint: For (b), compute the Fourier transforms (the characteristic functions) of
the measures in question.]

9. Consider the space $L^2(\Omega, \mathbb{R}^d)$ of square-integrable $\mathbb{R}^d$-valued functions on the
probability space $\Omega$. A closed subspace $\mathcal{G}$ of this space is called a Gaussian
subspace if it is spanned by a sequence $\{f_n\}$ of mutually independent functions,
each having a Gaussian distribution measure with mean zero, and covariance
$\{\sigma_n^2 I\}$.

Prove that if $F_1, F_2, \dots, F_k$ are mutually orthogonal elements of $\mathcal{G}$, then they
are mutually independent. Note that the converse of this is immediate.

[Hint: Consider the case when $\mathcal{G}$ is finite-dimensional, and $\mathcal{G}$ is spanned by $f_1, \dots, f_N$.
One may suppose, after multiplication by appropriate scalars, that the $f_j$ and $F_j$
each have $L^2$ norm equal to 1. So there exists an orthogonal linear transformation
$L$ so that $L(f_j) = F_j$. Then apply Exercises 5 and 8.]

10. Consider the following two types of convergence of a sequence $\{f_n\}$ to a limit
$f$ on a probability space:

(i) $f_n \to f$ almost everywhere,

(ii) $f_n \to f$ in terms of weak convergence of their distribution measures.

Prove that (i) implies (ii), but that this implication cannot be reversed.

[Hint: Recall that if $\varphi$ is continuous and bounded, then $\int \varphi(f) \, dm = \int \varphi \, d\mu_f$, where
$\mu_f$ is the distribution measure of $f$, and apply the dominated convergence theorem.]


11. On $[0,1]$ with Lebesgue measure, construct a function $f$ whose distribution
measure is normal.

[Hint: Consider the “error function” $\mathrm{Erf}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt$ and its inverse
function.]


12. Prove the identity (6), which says that if $G$ is a non-negative continuous
function on $\mathbb{R}$ (or continuous and bounded), and $f$ is a real-valued measurable
function (on a probability space $(X, m)$) with distribution measure $\mu = \mu_f$, then

$$\int_X G(f)(x) \, dm = \int_{\mathbb{R}} G(t) \, d\mu(t).$$

[Hint: Note that if $f$ is bounded, then $\sum_k G(k/n) \, m(\{k/n < f \le (k+1)/n\})$
converges to both integrals as $n \to \infty$.]

13. The Rademacher sequence $\{r_n\}$ is far from complete on $L^2([0,1])$. In fact it
cannot be completed by adjoining any finite collection of functions. Prove this in
two ways.

(a) By considering the functions $\{r_n r_m\}$ for $n < m$.

(b) By using the $L^p$ inequality of Lemma 1.8.

See also Exercise 16. 


14. Consider the power series

$$\sum \pm a_n z^n = \sum r_n(t) a_n z^n = F(z, t),$$

where $\sum |a_n|^2 = \infty$ and $\limsup |a_n|^{1/n} \le 1$.
Show that for almost every $t$, the function $F(z, t)$ cannot be analytically
continued outside the unit disc.

[Hint: Argue as in Theorem 1.7 part (b) using Abel summation rather than Cesàro
summation.]

15. Show that the $L^2([0,1])$ span of $\{r_n\}$ can be characterized as the subspace
of $L^2$ consisting of those $f$ for which

$$\mathbb{E}_N(f) = S_N(f), \quad \text{for all } N,$$

where $\mathbb{E}_N$ are the conditional expectations corresponding to the dyadic intervals
of length $2^{-N}$.


16. A natural completion of the collection of Rademacher functions is given by the
Walsh-Paley functions. One defines this collection on $[0,1]$, denoted by $\{w_n\}$, in the
following way.

First one sets $w_0(t) = 1$, $w_1(t) = r_1(t)$, $w_2(t) = r_2(t)$ and $w_3(t) = r_1(t) r_2(t)$.
More generally, if $n \ge 1$, $n = 2^{k_1} + 2^{k_2} + \cdots + 2^{k_\ell}$, where $0 \le k_1 < \cdots < k_\ell$, then
one defines

$$w_n(t) = \prod_{j=1}^{\ell} r_{k_j + 1}(t).$$

In particular $w_{2^{k-1}} = r_k$.

(a) Prove that $\{w_n\}_{n=0}^{\infty}$ is a complete orthonormal system on $L^2([0,1])$.

(b) Verify the following additional interesting property of the Walsh-Paley
functions: they are the continuous characters of the compact abelian group $\mathbb{Z}_2^{\infty}$
(thought of as the product of the two-point abelian groups $\mathbb{Z}_2$).

[Hint: Equip the group $\mathbb{Z}_2^{\infty}$ with the addition $x + y$ defined by $(x + y)_j = x_j + y_j$
mod 2 if $x = (x_j)$ and $y = (y_j)$. Then $r_k(x + y) = r_k(x) r_k(y)$.

Consider also the “Dirichlet kernel” $K_N(t) = \sum_{k=0}^{N-1} w_k(t)$ and show that if
$N = 2^n$, then $K_N(t) = \prod_{j=1}^{n} (1 + r_j(t))$, hence $K_N(t) = 2^n$ if $0 \le t \le 2^{-n}$ and $0$
otherwise. As a result, using the convolution $\int_0^1 f(y) K_N(x + y) \, dy$, note that if
$f \sim \sum a_k w_k$, then $\sum_{k < 2^n} a_k w_k = \mathbb{E}_n(f)$, where $\mathbb{E}_n$ was defined in Section 1.6.
See also Problem 2*.]


17. The inequality in Lemma 1.8 may be strengthened as follows. Let
$F(t) = \sum_{n=1}^{\infty} a_n r_n(t)$, with $a_n$ real and $\sum_{n=1}^{\infty} a_n^2 = 1$. Then

(a) $\int_0^1 e^{\mu |F(t)|} \, dt \le 2 e^{\mu^2}$, for all $0 \le \mu$.

(b) As a result, for some $c > 0$, $\int_0^1 e^{c |F(t)|^2} \, dt < \infty$.

[Hint: Part (a) implies that $m(\{t : |F(t)| > \alpha\}) \le 2 e^{\mu^2 - \mu\alpha}$. Choose $\mu = \alpha/2$, and
obtain (b) with $c < 1/4$.]

18. Prove that there exists an $f$ in $L^p([0, 2\pi])$, for all $p < \infty$, $f \sim \sum_{n=-\infty}^{\infty} c_n e^{in\theta}$,
so that $\sum |c_n|^q = \infty$, for all $q < 2$. Hence the Hausdorff-Young inequality in
Section 2.1 of Chapter 2 fails for $p > 2$.

[Hint: Use Theorem 1.7.]

19. Suppose $F(t) = \sum_{n=1}^{\infty} a_n r_n(t)$ with $\sum_{n=1}^{\infty} |a_n|^2 < \infty$.

(a) Prove directly that there exists a constant $A$ so that

$$\|F\|_{L^4} \le A \|F\|_{L^2}.$$

(b) Show as a result that there is a constant $A'$ so that $\|F\|_{L^2} \le A' \|F\|_{L^1}$.

(c) Conclude that $\|F\|_{L^p} \le A_p \|F\|_{L^1}$, for $1 \le p < \infty$.

[Hint: For (a) write out $\int F^4(t) \, dt$ as a sum and use the orthogonality of $r_n(t) r_m(t)$.
For (b) use Hölder’s inequality. For (c) use Lemma 1.8.]

20. Suppose $\{A_n\}$ is a sequence of subsets of the probability space $X$.

(a) If $\sum m(A_n) < \infty$, then $m(\limsup_{n \to \infty} A_n) = 0$, where $\limsup_{n \to \infty} A_n$ is
defined as $\bigcap_{r=1}^{\infty} \bigcup_{n \ge r} A_n$.

(b) However if $\sum m(A_n) = \infty$, and the sets $\{A_n\}$ are mutually independent,
then $m(\limsup_{n \to \infty} A_n) = 1$.

This dichotomy is often referred to as the Borel-Cantelli lemma. (See also Book III.)
[Hint: In the case (b), $m\big(\bigcap_{k=r}^{r'} A_k^c\big) = \prod_{k=r}^{r'} (1 - m(A_k))$.]

21. Except for a countable set (the dyadic rationals) it is possible to assign a
unique dyadic expansion to each real number $\alpha$ in $[0,1]$, that is,

$$\alpha = \sum_{j=1}^{\infty} \frac{x_j}{2^j} \quad \text{with } x_j = 0 \text{ or } 1.$$

Given such a number $\alpha$ let $\#_N(\alpha)$ denote the number of 1’s that appear among
the first $N$ terms in the dyadic expansion of $\alpha$. We say that $\alpha$ is normal, if its
dyadic expansion contains a density of 1’s equal to the density of 0’s, that is,

$$\lim_{N \to \infty} \frac{\#_N(\alpha)}{N} = 1/2.$$

(a) Prove that (with respect to the Lebesgue measure) almost every number in
$[0,1]$ is normal.

(b) More generally, given an integer $q \ge 2$ consider the $q$-expansion of a real
number $\alpha$ in $[0,1]$,

(21)  $\alpha = \sum_{j=1}^{\infty} \frac{x_j}{q^j}$ with $x_j = 0, 1, \dots, q - 1$.

Again, ignoring a countable subset, this expansion is unique. For a given real
number $\alpha$ and for each $0 \le p \le q - 1$, define $\#_{p,N}(\alpha)$ to equal the number
of $j$’s with $0 < j \le N$ for which $x_j = p$ in the $q$-expansion of $\alpha$. A number
that satisfies

$$\lim_{N \to \infty} \frac{\#_{p,N}(\alpha)}{N} = 1/q,$$

for every $0 \le p \le q - 1$ is said to be normal to base $q$.

Show that almost all real numbers in $[0,1]$ satisfy this property.

[Hint: Consider the infinite product of copies of $\{0, 1, \dots, q - 1\}$ with each factor
given the uniform measure. Under (21) the product measure corresponds to the
Lebesgue measure on $[0,1]$. Now apply the law of large numbers as in Theorem 2.1.]

22. A sequence $\{f_n\}_{n=0}^{\infty}$ of functions on $X$ is called a (discrete) stationary
process if for every $N$ the joint probability distribution of $f_r, f_{r+1}, \dots, f_{r+N}$ is
independent of $r$.

Consider the probability space $Y$ constructed as in the proof of Theorem 2.1.
Show that whenever the sequence $\{f_n\}$ is a stationary process, then it has the same
joint distribution as the sequence $\{g_0(\tau^n(y))\}$, where $g_0$ is a suitable function on $Y$
and $\tau$ is the shift. Hence the ergodic theorem is equally applicable in this more
general situation.

23. Prove that the conditions in Theorem 2.1 are sharp in the following sense. If
$\{f_n\}_{n=0}^{\infty}$ are mutually independent and identically distributed, but
$\int_X |f_0(x)| \, dm = \infty$, then for almost every $x$ the averages $\frac{1}{N} \sum_{n=0}^{N-1} f_n(x)$ fail to
converge to a limit as $N \to \infty$.

[Hint: Let $A_n = \{x : |f_n(x)| > n\}$. The sets $A_n$ are independent. However
$\sum_{n=0}^{\infty} m(A_n) = \sum_{n=0}^{\infty} m(\{x : |f_0(x)| > n\}) \ge \int_X |f_0(x)| \, dm = \infty$. Then use
Exercise 20.]

24. The following are examples of conditional expectations.

(a) Suppose $X = \bigcup A_n$ is a finite (or countable) partition of $X$, with $m(A_n) > 0$
whenever $A_n$ is non-empty. Let $\mathcal{A}$ be the algebra generated by the sets $\{A_n\}$.
Then $\mathbb{E}_{\mathcal{A}}(f)(x) = \frac{1}{m(A_n)} \int_{A_n} f \, dm$ whenever $x \in A_n$.

(b) Let $X = X_1 \times X_2$, with the measure $m$ on $X$ being the product of the
measures $m_i$ on $X_i$. Let $\mathcal{A} = \{A \times X_2\}$, where $A$ ranges over arbitrary
measurable sets of $X_1$. Then $\mathbb{E}_{\mathcal{A}}(f)(x_1, x_2) = \int_{X_2} f(x_1, y) \, dm_2(y)$.


25. In the following four exercises $\{s_n\}$ will denote a martingale sequence
corresponding to the increasing sequence of algebras $\mathcal{A}_n$ and their conditional
expectations $\mathbb{E}_n$.

Suppose $s_n = \mathbb{E}_n(s_\infty)$ with $s_\infty \in L^2$. Then $\{s_n\}$ converges in $L^2$.

[Hint: Note that if $f_n = s_n - s_{n-1}$ then the $f_n$’s are mutually orthogonal and
$s_n - s_0 = \sum_{k=1}^{n} f_k$.]

26. Prove the following.

(a) If $s_\infty \in L^p$, then $s_n = \mathbb{E}_n(s_\infty) \in L^p$, and $\|s_n\|_{L^p} \le \|s_\infty\|_{L^p}$ for all $p$ with
$1 \le p \le \infty$.

(b) Conversely, if $\{s_n\}$ is a martingale and $\sup_n \|s_n\|_{L^p} < \infty$, then there exists
$s_\infty \in L^p$, so that $s_n = \mathbb{E}_n(s_\infty)$, when $1 < p \le \infty$.

(c) Show, however, that the conclusion in (b) may fail when $p = 1$.

[Hint: For (a) argue as in the proof of Lemma 2.5(a). For (b), use Lemma 2.5 and
also the weak compactness of $L^p$, $p > 1$, as in Exercise 12 in Chapter 1. For (c),
let $X = [0,1]$ with Lebesgue measure, and consider $s_n(x) = 2^n$ for $0 \le x \le 2^{-n}$,
$s_n(x) = 0$ otherwise.]

27. Suppose that $s_n = \mathbb{E}_n(s_\infty)$, with $s_\infty$ integrable on $X$.

(a) Show that $s_n$ converges in the $L^1$ norm as $n \to \infty$.

(b) Moreover $s_n \to s_\infty$ in $L^1$ if and only if $s_\infty$ is measurable with respect to the
algebra $\mathcal{A}_\infty = \bigvee_{n=1}^{\infty} \mathcal{A}_n$.

[Hint: For (a) use Exercises 25 and 26 (a). Then $\lim s_n = \mathbb{E}_{\mathcal{A}_\infty}(s_\infty)$, and use the
previous exercise.]

28. Suppose that $s_n = \mathbb{E}_n(s_\infty)$, and $s_\infty \in L^1$.

(a) Show that

$$m(\{x : \sup_n |s_n(x)| > \alpha\}) \le \frac{2}{\alpha} \int_{|s_\infty(x)| > \alpha/2} |s_\infty(x)| \, dm.$$

(b) Prove as a result $\| \sup_n |s_n| \|_{L^p} \le A_p \|s_\infty\|_{L^p}$ if $s_\infty \in L^p$ and $1 < p \le \infty$.

[Hint: For (a), note that when $s_\infty \ge 0$ this is a consequence of (15). To deduce (b)
adapt the argument in the proof of Theorem 4.1 in Chapter 2 for the maximal
function $f^*$.]

29. The results for real-valued martingale sequences $\{s_n\}$ discussed in Section 2.2
go through if we assume that the $s_n$ take their values in $\mathbb{R}^d$. Verify in particular
that the following consequences of identity (14) hold:

(a) $|s_k| \le \mathbb{E}_k(|s_n|)$, if $k \le n$.

(b) $m(\{x : \sup_n |s_n(x)| > \alpha\}) \le \frac{2}{\alpha} \int_{|s_\infty(x)| > \alpha/2} |s_\infty(x)| \, dm$.

Here $|\cdot|$ denotes the Euclidean norm in $\mathbb{R}^d$.

[Hint: To prove (a), note that $(s_k, v) = \mathbb{E}_k((s_n, v))$, where $(\cdot, \cdot)$ is the inner product
on $\mathbb{R}^d$, and $v$ is any fixed vector in $\mathbb{R}^d$. Then take the supremum over unit vectors $v$.
The conclusion (b) is a consequence of (a) and part (a) of Exercise 28.]

30. The ideas regarding conditional expectations extend to spaces $(X, m)$ whose
total measures are not necessarily finite. Consider the following example: $X = \mathbb{R}^d$
with $m$ the Lebesgue measure. For each $n \in \mathbb{Z}$, let $\mathcal{A}_n$ be the algebra generated by
all dyadic cubes of side-length $2^{-n}$. The dyadic cubes are the open cubes, whose
vertices are points of $2^{-n} \mathbb{Z}^d$, and have side-length $2^{-n}$. Clearly $\mathcal{A}_n \subset \mathcal{A}_{n+1}$ for
all $n$. Let $f$ be integrable on $\mathbb{R}^d$, and set $\mathbb{E}_n(f) = \mathbb{E}_{\mathcal{A}_n}(f)$, with

$$\mathbb{E}_n(f)(x) = \frac{1}{m(Q)} \int_Q f \, dm$$

whenever $x \in Q$, with $Q$ a dyadic cube of length $2^{-n}$.

(a) Show that the maximal inequality in Theorem 2.10 extends to this case.

(b) If $f \ge 0$, then $\sup_{n \in \mathbb{Z}} \mathbb{E}_n(f)(x) \le c f^*(x)$ for an appropriate constant $c$, with $f^*$
the Hardy-Littlewood maximal function discussed in Chapter 2.

(c) Show by example that the converse inequality $f^*(x) \le c' \sup_{n \in \mathbb{Z}} \mathbb{E}_n(f)(x)$
fails. Prove however that a substitute result holds

$$m(\{x : f^*(x) > \alpha\}) \le C_1 \, m(\{x : \sup_{n \in \mathbb{Z}} \mathbb{E}_n(f)(x) > C_2 \alpha\})$$

for all $\alpha > 0$. Here $C_1$ and $C_2$ are appropriate constants.




31. Let $\{\mu_N\}_{N=1}^{\infty}$ and $\nu$ be probability measures on $\mathbb{R}^d$. Prove the following are
equivalent as $N \to \infty$.

(a) $\hat{\mu}_N(\xi) \to \hat{\nu}(\xi)$, all $\xi \in \mathbb{R}^d$.

(b) $\mu_N \to \nu$ weakly.

(c) In $\mathbb{R}$, $\mu_N((a, b)) \to \nu((a, b))$ for all open intervals $(a, b)$, if we assume the
measure $\nu$ is continuous.

(d) In $\mathbb{R}^d$, $\mu_N(\mathcal{O}) \to \nu(\mathcal{O})$ for all open sets $\mathcal{O}$, if we assume the measure $\nu$ is
absolutely continuous with respect to Lebesgue measure.

[Hint: In $\mathbb{R}$, the equivalence of (a), (b) and (c) is implicit in the argument given
in the proofs of Lemma 2.15 and Corollary 2.16. To show that (a) implies (d) in
the case when $\mathcal{O}$ is an open cube, generalize the argument given in the text to
$\mathbb{R}^d$. Then, prove that the analog of (d) holds for closed cubes. Finally, use the fact
that any open set is an almost disjoint union of closed cubes. To show that (d)
implies (b), approximate a continuous function $\varphi$ of compact support uniformly
by step functions that are constant on cubes.]

32. The proof of Theorem 2.17 requires the following calculation. Suppose $\sigma$ is a
strictly positive definite symmetric matrix with $\sigma^{-1}$ denoting its inverse. Let $\nu_{\sigma^2}$
be the measure on $\mathbb{R}^d$ with density equal to
$\frac{1}{(2\pi)^{d/2} (\det \sigma)} e^{-\frac{|\sigma^{-1}(x)|^2}{2}}$, $x \in \mathbb{R}^d$. Then

$$\hat{\nu}_{\sigma^2}(\xi) = e^{-2\pi^2 |\sigma(\xi)|^2}.$$

[Hint: Verify this by making an orthogonal change of variables that puts $\sigma$ in a
diagonal form. This reduces the $d$-dimensional integral in question to a product of
corresponding 1-dimensional integrals.]

33. For the $d$-dimensional random walk $\{s_n(x)\}$ considered in Section 2.6, find
the limit of the distribution measures of $s_n(x)/n^{1/2}$ as $n \to \infty$.

34. If $k$ is a lattice point in $\mathbb{Z}^d$ and $d = 1$ or $2$, show that for almost every path,
the random walk visits $k$ infinitely often, that is,

$$m(\{x : s_n(x) = k \text{ for infinitely many } n\}) = 1.$$

[Hint: There exists $\ell_0$ so that $m(\{s_{\ell_0} = -k\}) > 0$. If the conclusion fails, then
there exists $r_0$ so that $m(\{s_n \ne k, \text{ for all } n \ge r_0\}) > 0$. Then note that

$$\{s_n \ne 0, \text{ all } n \ge \ell_0 + r_0\} \supset \{s_{\ell_0} = -k\} \cap \{s_n - s_{\ell_0} \ne k, \text{ all } n \ge \ell_0 + r_0\},$$

and that the sets on the right-hand side are independent.]


35. Prove that if $d \ge 3$, then the random walk $s_n$ satisfies $\lim_{n \to \infty} |s_n| = \infty$ almost
everywhere.

[Hint: It is sufficient to prove that for any fixed $R > 0$ the set

$$B = \{x : \liminf_{n \to \infty} |s_n(x)| \le R\}$$

has measure 0. To this end, for each lattice point $k$, define

$$B(k, \ell) = \{x : s_\ell(x) = k, \text{ and } s_n(x) = k \text{ for infinitely many } n\}.$$

Clearly, $B \subset \bigcup_{\ell, |k| \le R} B(k, \ell)$. But $d \ge 3$, so $m(B(k, \ell)) = 0$.]


4 Problems 


1. In the context of Bernoulli trials with probabilities $0 < p, q < 1$, where $p + q = 1$,
let $D : \mathbb{Z}_2^{\infty} \to [0,1]$ be given by

$$D(x) = \sum_{n=1}^{\infty} \frac{x_n}{2^n} \quad \text{if } x = (x_1, \dots, x_n, \dots).$$

Under this mapping the measure $m_p$ goes to the measure $\mu_p$ that can be written
symbolically as a “Riesz product,” $\mu_p = \prod_{n=1}^{\infty} (1 + (p - q) r_n(t)) \, dt$. The meaning
of this is as follows. For each $N$, let

$$F_N(t) = \int_0^t \prod_{n=1}^{N} (1 + (p - q) r_n(s)) \, ds.$$

Then one can show that:

(a) Each $F_N$ is increasing on $[0,1]$.

(b) $F_N(0) = 0$, $F_N(1) = 1$.

(c) $F_N$ converges uniformly to a function $F$, as $N \to \infty$.

(d) $\mu_p = dF$, in the sense that $\mu_p((a, b)) = F(b) - F(a)$.

(e) If $p \ne 1/2$, then $\mu_p$ is completely singular (that is, $dF/dt = 0$ almost everywhere).

[Hint: Show that if $I = (a, b)$ is a dyadic interval of length $2^{-n}$, $a = \ell/2^n$ and
$b = (\ell + 1)/2^n$, and $N \ge n$, then

$$F_N(b) - F_N(a) = p^{n_0} q^{n_1},$$

where $n_0$ is the number of zeroes in the first $n$ terms of the dyadic expansion of
$\ell/2^n$, and $n_1$ is the number of 1’s, with $n_0 + n_1 = n$.]

2.* There is an analogy between the Walsh-Paley expansion (see Exercise 16) and
the Fourier expansion, that is, between $\{w_n\}_{n=0}^{\infty}$ and $\{e^{in\theta}\}$. In this analogy
the Rademacher functions $r_k = w_{2^{k-1}}$ correspond to the lacunary frequencies
$\{e^{i 2^k \theta}\}_{k=0}^{\infty}$. In fact, the following is known:

(a) If $\sum_{k=0}^{\infty} c_k e^{i 2^k \theta}$ is an $L^2([0, 2\pi])$ function, then it belongs to $L^p$, for every
$p < \infty$.

(b) If $\sum_{k=0}^{\infty} c_k e^{i 2^k \theta}$ is the Fourier series of an integrable function, it belongs to
$L^2$, and hence to $L^p$ for every $p < \infty$.

(c) This function belongs to $L^\infty$ if and only if $\sum_{k=0}^{\infty} |c_k| < \infty$.

(d) From (c) it follows that the conclusion (a) of Theorem 1.7 does not necessarily
extend to $p = \infty$.


3. The following is a general form of the central limit theorem. Suppose $f_1, \dots, f_n, \dots$
are square integrable mutually independent functions on $X$, and assume for
simplicity that each has mean equal to 0. Let $\mu_n$ be the distribution measure of $f_n$,
and $\sigma_n^2$ the variance. Set $S_n^2 = \sum_{k=1}^{n} \sigma_k^2$. The critical assumption is that for every
$\epsilon > 0$

$$\lim_{n \to \infty} \frac{1}{S_n^2} \sum_{k=1}^{n} \int_{|t| \ge \epsilon S_n} t^2 \, d\mu_k(t) = 0.$$

Under these conditions the distribution measures of $\frac{1}{S_n} \sum_{k=1}^{n} f_k$ converge weakly
to the normal distribution $\nu$ with variance 1.

4.* Suppose $\{f_n\}$ are identically distributed, square integrable, mutually independent,
and each has mean 0 and variance 1. Let $s_n = \sum_{k=1}^{n} f_k$. Then for a.e. $x$

$$\limsup_{n \to \infty} \frac{s_n(x)}{(2n \log \log n)^{1/2}} = 1.$$

This is the law of the “iterated logarithm.”

5. An interesting variant of the random walk in $\mathbb{R}^d$ (often referred to as a “random
flight”) arises if the motion of unit distance at the $n^{\text{th}}$ step is allowed to be in any
direction (of the unit sphere). More precisely

$$s_n = f_1 + \cdots + f_n,$$

where the $f_n$ are mutually independent, and each $f_n$ is uniformly distributed on
the unit sphere $S^{d-1} \subset \mathbb{R}^d$. The underlying probability space is defined as the
infinite product $X = \prod_{j=1}^{\infty} S_j$, where each $S_j = S^{d-1}$ with the usual surface measure
normalized to have integral 1.

(a) If $\mu$ is the distribution measure of each $f_n$, connect $\hat{\mu}(\xi)$ with Bessel functions.

(b) What is the covariance matrix?

(c) What is the limiting distribution of $s_n(x)/n^{1/2}$?

[Hint: Show that $\hat{\mu}(\xi) = \Gamma(d/2) (\pi |\xi|)^{(2-d)/2} J_{(d-2)/2}(2\pi |\xi|)$ by using the formulas
in Problem 2, Chapter 6 of Book I.]



An Introduction to Brownian 
Motion 


Norbert Wiener: a precocious genius... whose feeling
for physics and appreciation of Lebesgue integration
was so deep that he was the first to understand the
necessity of and the proper context for a rigorous
definition of Brownian motion, which he then devised,
going on to initiate the fundamentally important
theory of stochastic integrals; who, however, was so
unfamiliar with the standard probability techniques even
at elementary levels that his methods were so clumsily
indirect that some of his own doctoral students
did not realize that his Brownian motion process had
independent increments; who was the first to offer a
general definition of potential theoretic capacity; who,
however, published his probabilistic and potential
theoretic triumphs in a little-known journal, with the
result that this work remained unknown until too late
to have its deserved influence...


J. L. Doob, 1992 


Between the 19th and 20th centuries there was a change in the scientific
view of the natural world. The belief in the ultimate regularity and
predictability of nature gave way to the recognition of a degree of inherent
irregularity, uncertainty, and randomness. No mathematical construct
better encompasses this idea of randomness, nor has wider general
interest, than the process of Brownian motion.

While there are different ways of constructing the Brownian motion
process, the approach we have chosen attempts to see the Brownian paths
in $\mathbb{R}^d$ as limits of random walk paths, appropriately rescaled. The
analytic problem that then must be dealt with is the question of convergence
of the measures induced by these random walks to the “Wiener measure”
on the space $\mathcal{P}$ of paths.

A remarkable application of Brownian motion is to the solution of
Dirichlet’s problem in a general setting.$^1$ It is based on the following
insight that goes back to Kakutani. Namely, whenever $\mathcal{R}$ is a bounded
region in $\mathbb{R}^d$, $x$ a fixed point in it, and $E$ a subset of $\partial \mathcal{R}$, then the
probability that a Brownian path starting at $x$ exits $\mathcal{R}$ at $E$, is the
“harmonic measure” of $E$ with respect to $x$.

$^1$See also the previous discussion for the disc in terms of Fourier series in Book I, in
relation to conformal mappings in Book II, and the use of Dirichlet’s principle in Book III.

A key to understanding this approach is the notion of a “stopping 
time.” The basic example here is the first time that the path starting 
at x hits the boundary. Incidentally, stopping times were already used 
implicitly in the proof of the martingale maximal theorem in the previous 
chapter. 

One also needs to come to grips with the “strong Markov” property of 
Brownian motion, which essentially states that if the Brownian motion 
process is restarted after a stopping time, the result is an equivalent 
Brownian motion. The application of this Markov property is a little 
intricate, and it is best understood in terms of an identity that involves 
two stopping times. 


1 The Framework 


Here we begin by sketching the framework of our construction of Brownian
motion. At first we describe the situation somewhat imprecisely, and
postpone to Sections 2 and 3 below the exact definitions and statements.

We recall the random walk in $\mathbb{Z}^d$ studied in the previous chapter (in
Section 2.6). It is given by a sequence $\{s_n\}_{n=1}^\infty$ where

    $s_n = s_n(x) = \sum_{k=1}^{n} r_k(x),$

with $s_n(x) \in \mathbb{Z}^d$ for each $x$ in the probability space $\mathbb{Z}_{2d}^\infty$. This probability
space carries the probability measure $m$, which is the product measure
on $\mathbb{Z}_{2d}^\infty$. In this random walk we visit points in $\mathbb{Z}^d$, moving from a point
to one of its neighbors in steps of unit “time” and “distance.”

Next we consider the rectilinear paths obtained by joining these successive
points, and then rescale both time and distance, so that between
two consecutive steps the elapsed time is $1/N$ and the traversed distance
is $1/N^{1/2}$, all in accordance with our experience with the central limit
theorem. That is, for each $N$ we consider

(1)    $S_t^{(N)}(x) = \frac{1}{N^{1/2}} \sum_{1 \le k \le [Nt]} r_k(x) + \frac{Nt - [Nt]}{N^{1/2}} \, r_{[Nt]+1}(x).$

Now for each $N$, $S_t^{(N)}$ is a stochastic process, that is, for each $0 \le t < \infty$,
$S_t^{(N)}$ is a function (random variable) on a fixed probability space (here,
$(\mathbb{Z}_{2d}^\infty, m)$).

Our goal is the proper formulation and proof of the assertion that we
have the convergence

(2)    $S_t^{(N)} \to B_t$ as $N \to \infty$,

where $B_t$ is the Brownian motion process in $\mathbb{R}^d$.
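To make the rescaling in (1) concrete, the following Python sketch evaluates the rectilinear path at a given time from a finite list of steps. It is only an illustration under stated assumptions: the function names `random_steps` and `rescaled_walk` are hypothetical, and each step is drawn uniformly from the $2d$ unit vectors $\pm e_j$, which is our reading of "moving from a point to one of its neighbors."

```python
import math
import random


def random_steps(n, d, rng):
    """Draw n steps r_1..r_n, each one of the 2d unit vectors +/-e_j (assumed uniform)."""
    steps = []
    for _ in range(n):
        j = rng.randrange(d)                      # coordinate direction of the step
        sign = 1 if rng.random() < 0.5 else -1    # forward or backward
        step = [0] * d
        step[j] = sign
        steps.append(step)
    return steps


def rescaled_walk(steps, N, t):
    """Evaluate S_t^{(N)} as in formula (1); requires len(steps) >= floor(N*t) + 1."""
    d = len(steps[0])
    m = math.floor(N * t)          # [Nt], the number of completed steps
    frac = N * t - m               # Nt - [Nt], the fraction of the next step
    pos = [0.0] * d
    for k in range(m):             # full steps r_1, ..., r_[Nt]
        for i in range(d):
            pos[i] += steps[k][i]
    for i in range(d):             # partial step along r_{[Nt]+1}
        pos[i] += frac * steps[m][i]
    return [c / math.sqrt(N) for c in pos]
```

At times $t = k/N$ the function returns $s_k / N^{1/2}$, and between two such times it interpolates linearly, so the resulting path is continuous in $t$.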

So to proceed we need first to set down the properties that characterize
this process. Brownian motion $B_t$ is defined in terms of a probability
space $(\Omega, P)$, with $P$ its probability measure and $\omega$ denoting a typical
point in $\Omega$. We suppose that for each $0 \le t < \infty$, the function $B_t$ is
defined on $\Omega$ and takes values in $\mathbb{R}^d$. The Brownian motion process
$B_t = B_t(\omega)$ is then assumed to satisfy $B_0(\omega) = 0$ almost everywhere and:

B-1 The increments are independent, that is, if $0 \le t_1 < t_2 < \cdots < t_k$,
then $B_{t_1}, B_{t_2} - B_{t_1}, \ldots, B_{t_k} - B_{t_{k-1}}$ are mutually independent.

B-2 The increments $B_{t+h} - B_t$ are Gaussian with covariance $hI$ and
mean zero,$^2$ for each $0 \le t < \infty$. Here $I$ is the $d \times d$ identity matrix.

B-3 For almost every $\omega \in \Omega$, the path $t \mapsto B_t(\omega)$ is continuous for
$0 \le t < \infty$.

Note that in particular, $B_t$ is normally distributed with mean zero and
covariance $tI$.

Now it will turn out that this process can be realized in a canonical way
in terms of a natural choice of the probability space $\Omega$. This probability
space, denoted by $\mathcal{P}$, is the space of continuous paths in $\mathbb{R}^d$ starting at
the origin: it consists of the continuous functions $t \mapsto p(t)$ from $[0, \infty)$
to $\mathbb{R}^d$ with $p(0) = 0$.

Since, by assumption B-3, for almost every $\omega \in \Omega$ the function
$t \mapsto B_t(\omega)$ is such a continuous path, we get an inclusion $i : \Omega \to \mathcal{P}$ and then
the probability measure $P$ gives us, as we will see, a corresponding measure
$W$ (the “Wiener measure”) on $\mathcal{P}$.$^3$

One can in fact reverse the logic of these implications, starting with
the space $\mathcal{P}$ and a probability measure $W$ given on it. From this, one
can define a process $\tilde{B}_t$ on $\mathcal{P}$ with

(3)    $\tilde{B}_t(p) = p(t).$

2 In the notation of the previous chapter the increments have distribution 
3 More precisely, the inclusion $i$ is defined on a subset of $\Omega$ of full measure.

We then say that the measure $W$ on $\mathcal{P}$ is a Wiener measure if the
process $\tilde{B}_t$ defined by (3) satisfies the properties of Brownian motion set
down in B-1, B-2 and B-3 above. Thus the existence of a Wiener measure
is tantamount to the existence of Brownian motion. In fact, we will focus
on constructing a Wiener measure and then relabel $\tilde{B}_t$ and designate it
by $B_t$. Moreover we will see that such a Wiener measure on $\mathcal{P}$ is unique,
and so we speak of “the” Wiener measure.

Now returning to the random walks and their scalings given in (1), we
have defined for each $x \in \mathbb{Z}_{2d}^\infty$ a continuous path $t \mapsto S_t^{(N)}(x)$ defined for
$0 \le t < \infty$. Thus the probability measure $m$ on $\mathbb{Z}_{2d}^\infty$ induces a probability
measure $\mu_N$ on $\mathcal{P}$ via

    $\mu_N(A) = m(\{x \in \mathbb{Z}_{2d}^\infty : S_t^{(N)}(x) \in A\}),$

where $A$ is any Borel subset of paths in $\mathcal{P}$. With this, our goal is the
following assertion:

    The measures $\mu_N$ converge weakly to the Wiener measure $W$
    as $N \to \infty$.

Notice that it is not claimed that the convergence in (2) is anything like
pointwise almost everywhere, but only a statement essentially weaker in
appearance in terms of convergence of induced measures.$^4$


2 Technical Preliminaries 


With $\mathcal{P}$ denoting the collection of continuous paths $t \mapsto p(t)$ from $[0, \infty)$
to $\mathbb{R}^d$ such that $p(0) = 0$, we endow $\mathcal{P}$ with a metric with respect to which
convergence is equivalent to uniform convergence on compact subsets of
$[0, \infty)$.

For two such paths, $p$ and $p'$ in $\mathcal{P}$, we set

    $d_n(p, p') = \sup_{0 \le t \le n} |p(t) - p'(t)|,$

and

    $d(p, p') = \sum_{n=1}^{\infty} \frac{1}{2^n} \, \frac{d_n(p, p')}{1 + d_n(p, p')}.$

Then it is easily verified that $d$ is a metric on $\mathcal{P}$. We record here some
simple properties of $d$, whose proofs may be left to the reader:

• We have $d(p_k, p) \to 0$, as $k \to \infty$, if and only if $p_k \to p$ uniformly
on compact subsets of $[0, \infty)$.

• The space $\mathcal{P}$ is complete with respect to the metric $d$.

• $\mathcal{P}$ is separable.

(See also Exercise 2.)

4 Since $S_t^{(N)}$ and $B_t$ are defined on different probability spaces, pointwise almost
everywhere convergence would not be meaningful. It is also to be noted that the rectilinear
paths corresponding to $\{S_t^{(N)}\}$ are a subset of zero measure of $\mathcal{P}$.
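The metric $d$ can be approximated numerically, which may help in getting a feel for it. The sketch below is an illustration only, under the following assumptions that are not part of the text: paths are scalar-valued (the case $d = 1$) and given as Python callables, each supremum is replaced by a maximum over a uniform grid, and the series is truncated after `n_max` terms (which changes the value by less than $2^{-n_{max}}$). The name `path_metric` is hypothetical.

```python
import math


def path_metric(p, q, n_max=30, samples_per_unit=200):
    """Approximate d(p, q) = sum_{n>=1} 2^{-n} d_n / (1 + d_n) for scalar paths.

    d_n = sup_{0<=t<=n} |p(t) - q(t)| is approximated on a uniform grid,
    and the series is truncated after n_max terms.
    """
    running_sup = abs(p(0.0) - q(0.0))      # sup over [0, 0]
    total = 0.0
    for n in range(1, n_max + 1):
        # extend the running supremum from [0, n-1] to [0, n]
        for i in range((n - 1) * samples_per_unit + 1, n * samples_per_unit + 1):
            t = i / samples_per_unit
            running_sup = max(running_sup, abs(p(t) - q(t)))
        total += running_sup / (1.0 + running_sup) / 2.0 ** n
    return total
```

For instance, with $p(t) = t$ and $p'(t) = 0$ one has $d_n(p, p') = n$, so the series can be summed in closed form to $2 - 2\log 2$, which the approximation reproduces. Note also that $d(p, p') < 1$ for all paths, since each term is less than $2^{-n}$.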

We next consider the Borel sets $\mathcal{B}$ of $\mathcal{P}$, defined as the $\sigma$-algebra
of subsets of $\mathcal{P}$ generated by the open sets. Since $\mathcal{P}$ is separable, the
$\sigma$-algebra $\mathcal{B}$ is the same as the $\sigma$-algebra generated by the open balls
in $\mathcal{P}$.

A useful class of elementary sets in $\mathcal{B}$ is that of the cylindrical sets,
defined as follows. For each sequence $0 \le t_1 < t_2 < \cdots < t_k$ and each Borel
set $A$ in $\mathbb{R}^{dk} = \mathbb{R}^d \times \cdots \times \mathbb{R}^d$ (that is, $k$ factors $\mathbb{R}^d$), the set

    $\{p \in \mathcal{P} : (p(t_1), p(t_2), \ldots, p(t_k)) \in A\}$

is called a cylindrical set.$^5$ We denote by $\mathcal{C}$ the $\sigma$-algebra of $\mathcal{P}$ generated
by these sets (as $k$ ranges over all positive integers and $A$ over all Borel
sets in $\mathbb{R}^{dk}$).


Lemma 2.1 The $\sigma$-algebra $\mathcal{C}$ is the same as the $\sigma$-algebra $\mathcal{B}$ of Borel
sets.

Proof. If $\mathcal{O}$ is an open set in $\mathbb{R}^{dk}$, then clearly

    $\{p \in \mathcal{P} : (p(t_1), p(t_2), \ldots, p(t_k)) \in \mathcal{O}\}$

is open in $\mathcal{P}$, and hence this set belongs to $\mathcal{B}$. As a result, cylindrical
sets are in $\mathcal{B}$, thus $\mathcal{C} \subset \mathcal{B}$.

To see the reverse inclusion, note that for any fixed $n$ and $a$ and a given
path $p_0$, the set $\{p \in \mathcal{P} : \sup_{0 \le t \le n} |p(t) - p_0(t)| \le a\}$ is the same as the
corresponding set where the supremum is restricted to the $t$ in $[0, n]$ that
are rational, and hence this set is in $\mathcal{C}$. It is then not too difficult to see
that for any $\delta > 0$, the ball $\{p \in \mathcal{P} : d(p, p_0) < \delta\}$ is in $\mathcal{C}$. Since open
balls generate $\mathcal{B}$ we have $\mathcal{B} \subset \mathcal{C}$, and the lemma is established.

We will now consider probability measures on $\mathcal{P}$, and in what follows
these will always be assumed to be Borel measures, that is, defined
on the Borel subsets $\mathcal{B}$ of $\mathcal{P}$. For any such measure $\mu$, and any choice
$0 \le t_1 < t_2 < \cdots < t_k$, we define the section $\mu^{(t_1, t_2, \ldots, t_k)}$ of $\mu$ to be the
measure on $\mathbb{R}^{dk}$ given by

(4)    $\mu^{(t_1, \ldots, t_k)}(A) = \mu(\{p \in \mathcal{P} : (p(t_1), p(t_2), \ldots, p(t_k)) \in A\})$

for any Borel set $A$ in $\mathbb{R}^{dk}$.

It follows from Lemma 2.1 and Exercise 4 in the previous chapter that
two measures $\mu$ and $\nu$ on $\mathcal{P}$ are identical if $\mu^{(t_1, \ldots, t_k)} = \nu^{(t_1, \ldots, t_k)}$ for
all $0 \le t_1 < t_2 < \cdots < t_k$, since they then agree on all cylindrical sets
(and the intersection of two cylindrical sets is also a cylindrical set). The
converse, that if $\mu = \nu$ then all their sections agree, is obviously true.

5 This terminology is used to distinguish it from “cylinder sets” that appear in product
spaces.

We will be concerned with a sequence $\{\mu_N\}$ of measures on $\mathcal{P}$, and
the question whether this sequence converges weakly, that is, whether
there exists another probability measure $\mu$ so that

(5)    $\int_{\mathcal{P}} f \, d\mu_N \to \int_{\mathcal{P}} f \, d\mu$ as $N \to \infty$, for every $f \in C_b(\mathcal{P})$.

Here $C_b(\mathcal{P})$ denotes the continuous bounded functions on $\mathcal{P}$.

A particular feature of our metric space $\mathcal{P}$ that does not allow certain
compactness arguments to apply in regard to (5) is that $\mathcal{P}$ is not
$\sigma$-compact. (See Exercise 3.) This is the reason for the significance of the
following lemma of Prokhorov.

Suppose $X$ is a metric space. Assume that $\{\mu_N\}$ is a sequence of
probability measures on $X$, and that this sequence is tight in the sense
that for each $\epsilon > 0$, there is a compact set $K_\epsilon \subset X$ so that

(6)    $\mu_N(K_\epsilon^c) \le \epsilon$, for all $N$.

In other words, the measures $\mu_N$ assign a probability of at least $1 - \epsilon$ to
$K_\epsilon$ for all $N$.

Lemma 2.2 If $\{\mu_N\}$ is tight, then there is a subsequence $\{\mu_{N_k}\}$ that
converges weakly to a probability measure $\mu$ on $X$.

Proof. For each compact set $K_{1/m}$ arising in (6) with $\epsilon = 1/m$, we
construct a countable collection of functions $\mathcal{D}_m \subset C_b(X)$ so that:

(i) The functions $g|_{K_{1/m}}$, with $g \in \mathcal{D}_m$, are dense in $C(K_{1/m})$.

(ii) $\sup_{x \in X} |g(x)| = \sup_{x \in K_{1/m}} |g(x)|$, if $g \in \mathcal{D}_m$.

The $\mathcal{D}_m$ can be obtained as follows. Since $K_{1/m}$ is compact, $K_{1/m}$ and
$C(K_{1/m})$ are both separable. (See Exercise 4.) Now if $\{g_\ell\}$ is a countable
dense subset of $C(K_{1/m})$, we can extend each $g_\ell$ defined on $K_{1/m}$
to a function $\tilde{g}_\ell$ defined on $X$ by the Tietze extension principle. (See
Exercise 5.) The resulting collection of functions $\{\tilde{g}_\ell\}$ is taken to be $\mathcal{D}_m$.

Now since $\mathcal{D} = \bigcup_{m=1}^{\infty} \mathcal{D}_m$ is a countable collection of functions in
$C_b(X)$, we can use the usual diagonalization procedure to find a subsequence
of the measures $\{\mu_N\}$, which we relabel as $\{\mu_N\}$, so that

    $\mu_N(g) = \int_X g \, d\mu_N$

converges to a limit as $N \to \infty$, for each $g \in \mathcal{D}$.

Next we fix $f \in C_b(X)$, and write

    $\mu_N(f) = \mu_N(f - g) + \mu_N(g).$

Now given any $m$ we can find $g \in \mathcal{D}_m$, so that $|(f - g)(x)| \le 1/m$ if
$x \in K_{1/m}$. Therefore, with $\|\cdot\|$ denoting the sup-norm on $X$, we have

    $|\mu_N(f - g)| \le \int_{K_{1/m}} |f - g| \, d\mu_N + \int_{K_{1/m}^c} |f - g| \, d\mu_N
        \le \frac{1}{m} + \frac{1}{m} \|f - g\|
        \le \frac{1}{m} + \frac{1}{m} \left( 2\|f\| + \frac{1}{m} \right),$

where we have used (ii) above. From this it is clear that

    $\limsup_{N \to \infty} \mu_N(f) - \liminf_{N \to \infty} \mu_N(f) = O(1/m),$

and since $m$ was arbitrary, the conclusion is that $\lim_{N \to \infty} \mu_N(f)$ exists.
This defines a linear functional $\ell$ on $C_b(X)$ by

    $\ell(f) = \lim_{N \to \infty} \mu_N(f).$

Now we note that $\ell$ satisfies the requirements of Theorem 7.4 in Chapter 1.
In fact, given $\epsilon > 0$, if we choose $K_\epsilon$ as in the definition of tightness,
then

    $|\mu_N(f)| \le \int_{K_\epsilon} |f| \, d\mu_N + \int_{K_\epsilon^c} |f| \, d\mu_N,$

so the inequality (6) implies

    $|\mu_N(f)| \le \sup_{x \in K_\epsilon} |f(x)| + \epsilon \|f\|,$

and thus the same estimate holds for $\ell(f)$, satisfying the hypothesis (21)
of the relevant theorem in Chapter 1. This yields that the linear functional
$\ell$ is representable by a measure $\mu$, and since we then have $\mu_N(f) \to
\mu(f)$ for all $f \in C_b(X)$, we see that $\mu_N \to \mu$ weakly.

Corollary 2.3 Suppose the sequence of probability measures $\{\mu_N\}$ is
tight, and for each $0 \le t_1 < t_2 < \cdots < t_k$, the measures $\mu_N^{(t_1, \ldots, t_k)}$ converge
weakly to a measure $\mu_{t_1, \ldots, t_k}$, as $N \to \infty$. Then the sequence $\{\mu_N\}$
converges weakly to a measure $\mu$, and moreover $\mu^{(t_1, \ldots, t_k)} = \mu_{t_1, \ldots, t_k}$.

Proof. First, by Lemma 2.2, there is a subsequence $\{\mu_{N_m}\}$ that
converges weakly to a measure $\mu$. Next, $\mu_{N_m}^{(t_1, \ldots, t_k)} \to \mu^{(t_1, \ldots, t_k)}$ weakly.
In fact, if $\pi^{t_1, t_2, \ldots, t_k}$ is the continuous mapping from $\mathcal{P}$ to $\mathbb{R}^{kd}$ that assigns
to $p \in \mathcal{P}$ the point $(p(t_1), p(t_2), \ldots, p(t_k)) \in \mathbb{R}^{kd}$, then, by definition,
$\mu^{(t_1, \ldots, t_k)}(A) = \mu((\pi^{t_1, \ldots, t_k})^{-1} A)$ for any Borel set $A \subset \mathbb{R}^{kd}$. As a
result

    $\int_{\mathbb{R}^{dk}} f \, d\mu^{(t_1, \ldots, t_k)} = \int_{\mathcal{P}} (f \circ \pi^{t_1, \ldots, t_k}) \, d\mu$

for any $f \in C_b(\mathbb{R}^{dk})$, with a similar identity with $\mu$ replaced by $\mu_{N_m}$.
From this, and the weak convergence of $\mu_{N_m}$ to $\mu$, it follows that
$\mu^{(t_1, \ldots, t_k)} = \mu_{t_1, \ldots, t_k}$.

We now observe that the full sequence $\{\mu_N\}$ must converge weakly
to $\mu$. Suppose the contrary. Then there is another subsequence $\{\mu_{N'_m}\}$ and a
bounded continuous function $f$ on $\mathcal{P}$, so that $\int f \, d\mu_{N'_m}$ converges to a
limit that is not equal to $\int f \, d\mu$. Now using Lemma 2.2 again, there is
a further subsequence $\{\mu_{N''_m}\}$ and a measure $\nu$, so that $\mu_{N''_m}$ converges
weakly to $\nu$, while $\nu \ne \mu$. However by the previous argument we have
$\nu^{(t_1, \ldots, t_k)} = \mu^{(t_1, \ldots, t_k)}$ for all choices of $0 \le t_1 < t_2 < \cdots < t_k$. Therefore
$\nu = \mu$ and $\int f \, d\mu = \int f \, d\nu$. This contradiction completes the proof of
the corollary.

In applying the lemma and its corollary it will be necessary to prove
that appropriate subsets $K$ of the path space $\mathcal{P}$ are compact. The
following gives a sufficient condition for this when $K$ is closed. (It can be
shown to be necessary. See Exercise 6.)

Lemma 2.4 A closed set $K \subset \mathcal{P}$ is compact if for each positive $T$ there
is a positive bounded function $h \mapsto w_T(h)$, defined for $h \in (0, 1]$ with
$w_T(h) \to 0$ as $h \to 0$, and so that

(7)    $\sup_{p \in K} \sup_{0 \le t \le T} |p(t + h) - p(t)| \le w_T(h)$, for $h \in (0, 1]$.

The condition (7) implies that the functions on $K$ are equicontinuous on
each interval $[0, T]$. The lemma then essentially follows from the
Arzela-Ascoli criterion. (Recall, this criterion was used in a special setting in
Section 3, Chapter 8 of Book II.)

3 Construction of Brownian motion 

We now prove the existence of the probability measure $W$ on $\mathcal{P}$ that
satisfies the following: If we define the process $B_t$ on the probability
space $(\mathcal{P}, W)$ by

    $B_t(p) = p(t)$, for $p \in \mathcal{P}$,

then $B_t$ verifies the defining properties B-1, B-2 and B-3 of Brownian
motion set down at the beginning of this chapter (with $(\mathcal{P}, W)$ playing the
role of $(\Omega, P)$). Note that if we are assured of the existence of such a $W$,
the measure $W^{(t_1, \ldots, t_k)}$ is the distribution measure of $(B_{t_1}, \ldots, B_{t_k})$.
Therefore, by Exercise 8 in Chapter 5, this distribution measure is
determined by properties B-1 and B-2, hence with this data the Wiener
measure $W$ is uniquely determined, as in the remarks following the proof
of Lemma 2.1.

To construct $W$ we return to the random walk $\{s_n\}$ discussed at the
beginning of this chapter, with its attached probability space $(\mathbb{Z}_{2d}^\infty, m)$.

Now for each $x \in \mathbb{Z}_{2d}^\infty$ there is a path $t \mapsto S_t^{(N)}(x)$ given by (1). This
gives an injection $i_N : \mathbb{Z}_{2d}^\infty \to \mathcal{P}$. If $\mathcal{P}_N$ denotes the image of $i_N$ (the
collection of random walk paths scaled by the factor $N^{-1/2}$) then $\mathcal{P}_N$
is clearly a closed subset of $\mathcal{P}$. Now via $i_N$, the product measure $m$ on
$\mathbb{Z}_{2d}^\infty$ induces a Borel probability measure $\mu_N$ on $\mathcal{P}$, which is supported on
$\mathcal{P}_N$, by the identity $\mu_N(A) = m(i_N^{-1}(A \cap \mathcal{P}_N))$. (Note that $i_N^{-1}(A \cap \mathcal{P}_N)$
is a cylinder set in the product space $\mathbb{Z}_{2d}^\infty$ whenever $A$ is a cylindrical set
in $\mathcal{P}$.)

Theorem 3.1 The measures $\mu_N$ on $\mathcal{P}$ converge weakly to a measure as
$N \to \infty$. This limit is the Wiener measure $W$.

There are two steps in the proof. The first, that the sequence $\{\mu_N\}$ satisfies
the tightness condition, is a little intricate. The second, that then $\mu_N$
converges to the Wiener measure, is more direct. The second step is
based on the central limit theorem.

For the first step, the following lemma is key. It is a consequence of
the martingale properties of sums of independent random variables dealt
with in the previous chapter. Consider the unscaled random walk

    $s_n(x) = \sum_{1 \le k \le n} r_k(x).$

This is $S_t^{(N)}$ in (1) with $N = 1$ and $t = n$.

Lemma 3.2 We have, as $\lambda \to \infty$,

(8)    $\sup_{n \ge 1} m(\{x : \sup_{k \le n} |s_k(x)| > \lambda n^{1/2}\}) = O(\lambda^{-p})$

for every $p \ge 2$.

Remark. In the first application below it suffices to have the conclusion
for some $p$ such that $p > 2$.
To prove the lemma we apply the martingale maximal theorem of the
previous chapter (Theorem 2.10, in the form that it takes in Exercise 29,
part (b)) to the stopped sequence $\{s'_k\}$ defined as $s'_k = s_k$ if $k \le n$,
$s'_k = s_n$ if $k \ge n$, and $s'_\infty = s_n$. With $s_n^* = \sup_{k \le n} |s_k| = \sup_k |s'_k|$ we then
have

(9)    $m(\{x : s_n^* > \alpha\}) \le \frac{1}{\alpha} \int_{\{|s_n^*| > \alpha\}} |s_n| \, dm.$

Multiplying both sides by $p\alpha^{p-1}$ and integrating, using an argument
similar to the one used in Section 4.1 of Chapter 2 yields

    $\int (s_n^*)^p \, dm \le A_p \int |s_n|^p \, dm.$

Now, the Khinchin inequality of Lemma 1.8 in the previous chapter,
applied to the more general setting described in Exercise 10 gives

    $\int |s_n|^p \, dm \le A \left( \int |s_n|^2 \, dm \right)^{p/2}.$

Thus

    $m(\{s_n^* > \alpha\}) \le \frac{1}{\alpha^p} \|s_n^*\|_{L^p}^p \le \frac{A'}{\alpha^p} \|s_n\|_{L^2}^p.$

Setting $\alpha = \lambda n^{1/2}$ and recalling that $\|s_n\|_{L^2} = n^{1/2}$ completes the proof
of the lemma.

Let us now prove that the sequence $\{\mu_N\}$ converges weakly to a measure
$\mu$. For this we use Corollary 2.3, and begin by showing that the
sequence $\{\mu_N\}$ is tight, that is, for every $\epsilon > 0$ there is a compact subset
$K_\epsilon$ of $\mathcal{P}$ so that $\mu_N(K_\epsilon^c) \le \epsilon$ for all $N$.

To this end we will invoke Lemma 2.4 and first consider the situation
for $T = 1$. We fix $0 < a < 1/2$, throughout the rest of the proof of
the theorem. Then with our given $\epsilon$ we will see that we can select a
sufficiently large constant $c_1$ so that

(10)    $m(\{x : \sup_{0 \le t \le 1,\ 0 \le h \le \delta} |S_{t+h}^{(N)} - S_t^{(N)}| > c_1 \delta^a$ for some $\delta \le 1\}) \le \epsilon.$

Therefore if we define

    $\mathcal{K}^{(1)} = \{x : \sup_{0 \le t \le 1,\ 0 \le h \le \delta} |S_{t+h}^{(N)}(x) - S_t^{(N)}(x)| \le c_1 \delta^a$, all $\delta \le 1\},$

and

    $K^{(1)} = \{p : \sup_{0 \le t \le 1,\ 0 \le h \le \delta} |p(t + h) - p(t)| \le c_1 \delta^a$, all $\delta \le 1\},$

then $m((\mathcal{K}^{(1)})^c) = \mu_N((K^{(1)})^c) \le \epsilon$. Note also that then (7) is satisfied
for $K = K^{(1)}$, $T = 1$, and $w_1(\delta) = c_1 \delta^a$, and hence $K^{(1)}$ is compact.

In proving (10), let us first consider the analog of this set but with $\delta$
fixed, and $\delta$ of the form $\delta = 2^{-k}$, with $k$ a non-negative integer. We then
decompose the interval $[0, 1]$ via the $2^k + 1$ partition points $\{t_j\}$, where
$t_j = j\delta = j 2^{-k}$, with $0 \le j \le 2^k$. Next, observe that for any function $f$
defined on $[0, 1 + \delta]$, we have

    $\sup_{0 \le t \le 1,\ 0 \le h \le \delta} |f(t + h) - f(t)| \le 2 \max_j \{ \sup_{0 \le h \le \delta} |f(t_j + h) - f(t_j)| \}.$

Thus with $f(t) = S_t^{(N)}$ and any fixed $\sigma > 0$,

    $m(\{ \sup_{0 \le t \le 1,\ 0 \le h \le \delta} |S_{t+h}^{(N)} - S_t^{(N)}| > \sigma \}) \le \sum_{j=0}^{2^k} m(\{ \sup_{0 \le h \le \delta} |S_{t_j+h}^{(N)} - S_{t_j}^{(N)}| > \sigma/2 \}).$

However $m(\{x : \sup_{0 \le h \le \delta} |S_{t_j+h}^{(N)} - S_{t_j}^{(N)}| > \sigma/2\})$ equals the same quantity
with $t_j$ replaced by 0, that is, it equals

    $m(\{x : \sup_{0 \le h \le \delta} |S_h^{(N)}| > \sigma/2\}),$

and this itself equals $m(\{x : \sup_{n \le \delta N} |s_n(x)| > (\sigma/2) N^{1/2}\})$. These
assertions follow from the definition (1) and the “stationarity” of the random
walk: the fact that the joint probability distribution of $(r_m, r_{m+1},
\ldots, r_{m+n})$ is independent of $m$, for all $m \ge 1$ and $n \ge 0$. (Recall that
$\{r_n\}$ are defined in Section 2.6 of the previous chapter.)

Thus by Lemma 3.2, if we take $\lambda = \sigma/(2\delta^{1/2})$, then $N^{1/2} \sigma/2 = \lambda (\delta N)^{1/2}$,
and

    $m(\{x : \sup_{0 \le t \le 1,\ 0 \le h \le \delta} |S_{t+h}^{(N)} - S_t^{(N)}| > \sigma\}) = O(\delta^{-1} \sigma^{-p} \delta^{p/2}).$

Here $p$ is at our disposal. We now set $\sigma = c_1 \delta^a$, with a fixed $0 < a < 1/2$.
Then the $O$ term becomes $O(c_1^{-p} \delta^b)$, with $b = -1 + (\frac{1}{2} - a)p$. Therefore,
since $a < 1/2$ we can make $b$ strictly positive by choosing $p$ large enough,
and then fix $p$. To summarize, with $\delta = 2^{-k}$ we have proved

    $m(\{x : \sup_{0 \le t \le 1,\ 0 \le h \le \delta} |S_{t+h}^{(N)} - S_t^{(N)}| > c_1 \delta^a\}) = O(c_1^{-p} 2^{-kb}).$

Now each $\delta$, with $0 < \delta \le 1$, lies between $2^{-k-1}$ and $2^{-k}$ for some integer
$k \ge 0$. Thus when we take the union of the corresponding sets and add
their measures (summing over $k$) we get a total measure that is $O(c_1^{-p})$,
and this is less than $\epsilon$ if $c_1$ is large enough. So we have obtained our
desired conclusion (10).

In the same way we can prove the following analog of this conclusion:
for any $T > 0$, and $\epsilon_T > 0$, there is a constant $c_T$ sufficiently large so
that $m((\mathcal{K}^{(T)})^c) \le \epsilon_T$, where

    $\mathcal{K}^{(T)} = \{x : \sup_{0 \le t \le T,\ 0 \le h \le \delta} |S_{t+h}^{(N)} - S_t^{(N)}| \le c_T \delta^a$, all $\delta \le 1\}.$

This can be restated as follows. If

    $K^{(T)} = \{p \in \mathcal{P} : \sup_{0 \le t \le T,\ 0 \le h \le \delta} |p(t + h) - p(t)| \le c_T \delta^a$, all $\delta \le 1\},$

then $\mu_N((K^{(T)})^c) = m((\mathcal{K}^{(T)})^c) \le \epsilon_T$.

Therefore, if we let $T$ range over the positive integers, set $\epsilon_n = \epsilon/2^n$,
and $K = \bigcap_{n=1}^{\infty} K^{(n)}$, then we have $\mu_N(K^c) \le \epsilon$, and thus by the
compactness of $K$ guaranteed by Lemma 2.4, the tightness of the sequence
$\{\mu_N\}$ will be established.


Now to show that the measure $\mu_N$ converges weakly it suffices, by
Corollary 2.3, to show that for each $0 \le t_1 < t_2 < \cdots < t_k$, the measures
$\mu_N^{(t_1, \ldots, t_k)}$ converge weakly to the putative measure $W^{(t_1, \ldots, t_k)}$.
However the central limit theorem (Theorem 2.17 of the previous chapter
together with Exercise 1 below) shows that the distribution measures
of $S_{t_\ell}^{(N)} - S_{t_{\ell-1}}^{(N)}$ converge weakly to the Gaussian measure with mean zero
and covariance matrix $(t_\ell - t_{\ell-1}) I$ (see Exercise 1). Moreover, since

    $S_{t_\ell}^{(N)} = S_{t_1}^{(N)} + (S_{t_2}^{(N)} - S_{t_1}^{(N)}) + \cdots + (S_{t_\ell}^{(N)} - S_{t_{\ell-1}}^{(N)}),$

Exercise 8 (a) in the previous chapter shows that the distribution measures
of the vectors of random variables $(S_{t_1}^{(N)}, S_{t_2}^{(N)}, \ldots, S_{t_k}^{(N)})$ converge
weakly to the presumed measure $W^{(t_1, \ldots, t_k)}$ as $N \to \infty$. Thus the
sequence $\{\mu_N\}$ converges weakly to a measure and this measure is then
the desired Wiener measure $W$, and this completes the proof of the
theorem.


Our construction of Brownian motion was done in terms of the limit of
scalings of the simple random walk treated in Section 2.6 of the previous
chapter. However the Brownian motion process can also be obtained as
a corresponding scaling limit of more general random walks, as follows.

Let $f_1, \ldots, f_n, \ldots$ be a sequence of identically distributed mutually
independent square integrable $\mathbb{R}^d$-valued functions on a probability space
$(X, m)$, each having mean zero and the identity as its covariance matrix.
Define, as in (1),

    $S_t^{(N)} = \frac{1}{N^{1/2}} \sum_{1 \le k \le [Nt]} f_k + \frac{Nt - [Nt]}{N^{1/2}} \, f_{[Nt]+1},$

and let $\mu_N$ be the corresponding measures on $\mathcal{P}$ induced via the measure
$m$ on $X$. The result is then that $\{\mu_N\}$ converges weakly to the Wiener
measure $W$ as $N \to \infty$.
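A small numerical experiment can illustrate the normalization in this general setting. The sketch below is only an illustration, not part of the proof, and rests on assumptions of our own: we take $d = 1$, and the steps $f_k$ are uniformly distributed on $[-\sqrt{3}, \sqrt{3}]$, which gives mean zero and variance one. The helper names `endpoint` and `empirical_mean_and_variance` are hypothetical. The code samples the endpoint $S_1^{(N)} = N^{-1/2}(f_1 + \cdots + f_N)$ and estimates its mean and variance, which should be close to 0 and 1, the mean and variance of $B_1$.

```python
import math
import random


def endpoint(N, rng):
    """Sample S_1^{(N)} = N^{-1/2} (f_1 + ... + f_N), with f_k uniform on [-sqrt(3), sqrt(3)]."""
    a = math.sqrt(3.0)             # uniform on [-a, a] has variance a^2 / 3 = 1
    return sum(rng.uniform(-a, a) for _ in range(N)) / math.sqrt(N)


def empirical_mean_and_variance(N, trials, seed=0):
    """Estimate the mean and variance of S_1^{(N)} from independent samples."""
    rng = random.Random(seed)      # fixed seed so the experiment is reproducible
    xs = [endpoint(N, rng) for _ in range(trials)]
    mean = sum(xs) / trials
    var = sum((x - mean) ** 2 for x in xs) / trials
    return mean, var
```

Note that the variance of $S_1^{(N)}$ equals 1 exactly for every $N$; what changes as $N$ grows is that the distribution approaches the Gaussian one, which this simple check of two moments does not by itself detect.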

In this general setting the result is known as the Donsker invariance
principle. The modifications needed for a proof of this generalization
are outlined in Problem 2. A particularly striking example of the
convergence to the Brownian motion process then arises if we choose the
$\{f_n\}$ occurring in the process of “random flight” discussed in Problem 5
in the previous chapter.

4 Some further properties of Brownian motion

We describe now several interesting properties enjoyed by the Brownian
motion process. In general it is useful to think of this process either in
terms of an abstract realization $B_t$ on $(\Omega, P)$ satisfying conditions B-1,
B-2 and B-3, or in terms of its concrete realization on $(\mathcal{P}, W)$ with $W$ the Wiener
measure, given by $B_t(\omega) = p(t)$, where $\omega$ is identified with $p$.
More about this identification can be found in Exercises 8 and 9 below.
It will also be convenient to augment the Borel $\sigma$-algebra of $\mathcal{P}$ by all
subsets of Borel sets of $W$-measure zero.$^6$

We begin by observing three simple but significant invariance statements.
(Another symmetry of Brownian motion is described in Exercise 13.)

Theorem 4.1 The following are also Brownian motion processes:

(a) $\delta^{-1/2} B_{\delta t}$ for every fixed $\delta > 0$.

(b) $\mathfrak{o}(B_t)$ whenever $\mathfrak{o}$ is an orthogonal linear transformation on $\mathbb{R}^d$.

(c) $B_{t+\sigma_0} - B_{\sigma_0}$ whenever $\sigma_0 \ge 0$ is a constant.

We need only check that these new processes satisfy the conditions B-1,
B-2, and B-3 defining Brownian motion. Thus the assertion (a) of the
theorem is clear once we observe that for any function $f$, the covariance
matrix of $\delta^{-1/2} f$ is $\delta^{-1}$ times the covariance matrix of $f$. The
assertion (b) is also obvious once we note that the covariance matrix of $\mathfrak{o}(f)$
is the same as that of $f$; and that if $f_1, \ldots, f_n, \ldots$ are mutually independent
so are $\mathfrak{o}(f_1), \mathfrak{o}(f_2), \ldots, \mathfrak{o}(f_n), \ldots$. Finally (c) is immediate from the
definition of Brownian motion.


The next result concerns the regularity of the paths of Brownian motion.
The conclusion is that almost all paths satisfy a Hölder condition
of exponent $a$, with $a < 1/2$; this fails however when $a > 1/2$. (This failure
extends to the critical case $a = 1/2$, but this is discussed separately
in Exercise 14.) Moreover, almost every path is nowhere differentiable.
The conclusions are subsumed in the theorem below.


Theorem 4.2 With $W$ the Wiener measure on $\mathcal{P}$ we have:

(a) If $0 < a < 1/2$ and $T > 0$, then, with respect to $W$ almost every
path $p$ satisfies

    $\sup_{0 \le t \le T,\ 0 < h \le 1} \frac{|p(t + h) - p(t)|}{h^a} < \infty.$

(b) On the other hand, if $a > 1/2$, then for almost every path $p$

    $\limsup_{h \to 0} \frac{|p(t + h) - p(t)|}{h^a} = \infty,$

for every $t \ge 0$.

6 This is the completion of the measure space as outlined in Exercise 2, Chapter 6 of
Book III.


The first conclusion is implicit in our construction of Brownian motion.
Indeed, suppose $K^{(T)}$ is the set arising in the proof of Theorem 3.1.
Then we have seen that $\mu_N(K^{(T)}) \ge 1 - \epsilon$ for every $N$. Since $K^{(T)}$ is
closed, the same holds for the weak limit of the $\{\mu_N\}$. Hence $W(K^{(T)}) \ge 1 - \epsilon$. But by
the definition of $K^{(T)}$ we have the inequality in (a) for every $p \in K^{(T)}$.
Since $\epsilon$ is arbitrary, the first conclusion holds.

To prove the second conclusion we fix an $a > 1/2$, and a positive integer
$k$, so that $dk(a - 1/2) > 1$.

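Now, for any positive integer $n$, note that if there is a $t \in [0, 1]$ so
that

(11)    $\sup_{0 < h \le (k+1)/n} \frac{|p(t + h) - p(t)|}{h^a} \le \lambda,$

then there is an integer $j_0$, $0 \le j_0 \le n - 1$ so that

    $\max_{1 \le \ell \le k} \left| p\left( \frac{j_0 + \ell + 1}{n} \right) - p\left( \frac{j_0 + \ell}{n} \right) \right| \le c_k \lambda n^{-a},$

where $c_k = 2(k + 1)^a$. By renaming $\lambda$, we may proceed assuming $c_k = 1$.
Thus if we let $E_\lambda^n$ denote the set of paths $p$ where (11) holds, then
$E_\lambda^n \subset \tilde{E}_\lambda^n$ with

    $\tilde{E}_\lambda^n = \bigcup_{j_0 = 0}^{n-1} \left\{ p \in \mathcal{P} : \max_{1 \le \ell \le k} \left| p\left( \frac{j_0 + \ell + 1}{n} \right) - p\left( \frac{j_0 + \ell}{n} \right) \right| \le \lambda n^{-a} \right\}.$

But the $k$ sets $\{p \in \mathcal{P} : |p(\frac{j_0 + \ell + 1}{n}) - p(\frac{j_0 + \ell}{n})| \le \lambda n^{-a}\}$, $1 \le \ell \le k$, are
mutually independent; also the measures of these sets are the same as $\ell$
and $j_0$ vary. Hence

    $W\left( \left\{ p \in \mathcal{P} : \max_{1 \le \ell \le k} \left| p\left( \frac{j_0 + \ell + 1}{n} \right) - p\left( \frac{j_0 + \ell}{n} \right) \right| \le \lambda n^{-a} \right\} \right)
        = \left( W\{p \in \mathcal{P} : |p(1/n)| \le \lambda n^{-a}\} \right)^k.$

Thus $W(E_\lambda^n) \le W(\tilde{E}_\lambda^n) \le n \left( W\{p \in \mathcal{P} : |p(1/n)| \le \lambda n^{-a}\} \right)^k$. However,
by the scaling property (a) of the previous theorem

    $W\{p : |p(1/n)| \le \lambda n^{-a}\} = W\{p \in \mathcal{P} : |p(1)| \le \lambda n^{1/2 - a}\}.$

However $p(1)$ has a Gaussian as its distribution measure. Thus the last
quantity is $O(\lambda^d n^{d(1/2 - a)})$ as $n \to \infty$. As a result

    $W(E_\lambda^n) = O(\lambda^{dk} \, n \, n^{dk(1/2 - a)})$

and this converges to zero as $n \to \infty$. Thus for every positive $\lambda$, the set
of $p$ where (11) holds has measure converging to zero as $n \to \infty$ because
$a > 1/2$. This establishes conclusion (b) of the theorem.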
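The role of the integer $k$ in this argument can be made explicit by a small computation. The bound obtained above is of order $n^{1 + dk(1/2 - a)}$, and this tends to zero exactly when $dk(a - 1/2) > 1$. The following sketch, with hypothetical helper names `smallest_k` and `decay_exponent`, finds the least admissible $k$ for given $d$ and $a$; exact rational arithmetic is used (an implementation choice of ours) so that the borderline case $dk(a - 1/2) = 1$ is handled correctly.

```python
import math
from fractions import Fraction


def smallest_k(d, a):
    """Least positive integer k with d * k * (a - 1/2) > 1, for a rational a > 1/2."""
    a = Fraction(a)
    if a <= Fraction(1, 2):
        raise ValueError("the argument requires a > 1/2")
    threshold = 1 / (d * (a - Fraction(1, 2)))   # need k > threshold (strictly)
    return math.floor(threshold) + 1


def decay_exponent(d, k, a):
    """Exponent of n in the bound n * n^{dk(1/2 - a)}; negative means decay."""
    return 1 + d * k * (Fraction(1, 2) - Fraction(a))
```

For example, in dimension $d = 1$ with $a = 3/4$ one needs $k = 5$ consecutive increments, and the bound then decays like $n^{-1/4}$; the closer $a$ is to $1/2$, the more increments are required.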

At this point, it may be worthwhile to recall the variety of ways a 
nowhere differentiable function has arisen in different settings in these 
Volumes. First, as a specific example of a lacunary Fourier series in 
Book I; next as a von Koch fractal, in Book III; further as the generic 
continuous function via the Baire category theorem in Chapter 4; and 
now lastly as almost every Brownian path. 

One last remark. Given our construction it is intuitively tempting to
think of almost every Brownian path as the “limit” of an appropriate
collection of random walk paths (paths in $\mathcal{P}_N$ with $N \to \infty$). However it
is not clear how to make such an idea precise. Despite this, the following
less satisfactory substitute is a direct consequence of Theorem 3.1.

Let $q \in \mathcal{P}$ be any fixed path. Suppose $\epsilon > 0$ and $0 \le t_1 < t_2 < \cdots < t_n$
are given. We consider the open set

    $\mathcal{O}_\epsilon = \{p \in \mathcal{P} : |p(t_j) - q(t_j)| < \epsilon,\ 1 \le j \le n\}$

of paths close to $q$, and set $\mathcal{O}_\epsilon^{(N)} = \mathcal{O}_\epsilon \cap \mathcal{P}_N$, the bundle of corresponding
random walk paths. Then

(12)    $m(\{x \in \mathbb{Z}_{2d}^\infty : S_t^{(N)}(x) \in \mathcal{O}_\epsilon^{(N)}\}) \to W(\mathcal{O}_\epsilon)$, as $N \to \infty$.

In fact, (12) is merely a restatement of the assertion $\mu_N(\mathcal{O}_\epsilon) \to W(\mathcal{O}_\epsilon)$ as
$N \to \infty$. This follows because $\mu_N \to W$ weakly, using Exercise 7, since
it is easily checked that $W(\overline{\mathcal{O}_\epsilon} - \mathcal{O}_\epsilon) = 0$.

5 Stopping times and the strong Markov property 

The goal of the rest of this chapter is to exhibit the remarkable role of 
Brownian motion in the solution of the Dirichlet problem. A general 
setting for this problem is as follows. 

We are given a bounded open set $\mathcal{R}$ in $\mathbb{R}^d$ and a continuous function
$f$ on the boundary $\partial\mathcal{R} = \overline{\mathcal{R}} - \mathcal{R}$. Then the issue is that of finding a
function $u$ continuous on $\overline{\mathcal{R}}$, harmonic in $\mathcal{R}$, that is $\Delta u = 0$, and with
the boundary condition $u|_{\partial\mathcal{R}} = f$.

The connection of this question with Brownian motion arises when we
fix a point $x \in \mathcal{R}$ and consider Brownian motion starting at $x$, that is,
the process $B_t^x = x + B_t$. Now for each $\omega \in \Omega$ we consider the first time
$\tau = \tau(\omega) = \tau^x(\omega)$ when the Brownian motion path $t \mapsto B_t^x(\omega)$ exits $\mathcal{R}$
(in particular, $B_{\tau(\omega)}^x(\omega) = B_{\tau^x(\omega)}^x(\omega) \in \partial\mathcal{R}$).

Figure 1. Path $\omega$, exiting at time $\tau$
Then the resulting induced measure $\mu^x = \mu$ on $\partial\mathcal{R}$, given by

    $\mu^x(E) = P(\{\omega : B_{\tau(\omega)}^x(\omega) \in E\})$

(also called “harmonic measure”) leads to the solution of the problem:
under appropriate restrictions on the set $\mathcal{R}$,

    $u(x) = \int_{\partial\mathcal{R}} f(y) \, d\mu^x(y), \quad x \in \mathcal{R},$

is the desired harmonic function.
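The idea behind this formula can be tried out in the simplest discrete analog, which is an illustration of ours and not the construction used in the text. We replace Brownian motion by the simple random walk on the integers, the region by the points strictly between $0$ and $M$, and the boundary by $\{0, M\}$. The average of the boundary data $f$ over the exit points of many walks started at $x$ then estimates a discrete harmonic function, which in this case is the linear interpolation $f(0) + (f(M) - f(0))\,x/M$. The function names `exit_point` and `dirichlet_estimate` are hypothetical.

```python
import random


def exit_point(x, M, rng):
    """Run a simple random walk from x until it leaves {1, ..., M-1}; return the exit point."""
    while 0 < x < M:
        x += 1 if rng.random() < 0.5 else -1
    return x


def dirichlet_estimate(x, M, f, trials=20000, seed=0):
    """Monte Carlo estimate of u(x) = E[f(exit point)] for the walk started at x."""
    rng = random.Random(seed)      # fixed seed so the experiment is reproducible
    return sum(f(exit_point(x, M, rng)) for _ in range(trials)) / trials
```

With $f$ equal to 1 at $M$ and 0 at $0$, the estimate approximates the probability of exiting at $M$, the discrete counterpart of the harmonic measure of $\{M\}$ with respect to $x$; for $x = 2$ and $M = 8$ this probability is $1/4$.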

Now the function $\omega \mapsto \tau(\omega)$ will be seen to be a “stopping time,” and
we begin by discussing this notion, which arose implicitly when we proved
the maximal theorem for martingale sequences in Theorem 2.10 of the
previous chapter.

5.1 Stopping times and the Blumenthal zero-one law 

Suppose $\{s_n\}_{n=0}^{\infty}$ is a martingale sequence associated to the increasing
sequence $\{\mathcal{A}_n\}_{n=0}^{\infty}$ of $\sigma$-algebras on the probability space $(X, m)$. Then
an integer-valued function $\tau : x \mapsto \tau(x)$ is a stopping time if $\{x : \tau(x) =
n\} \in \mathcal{A}_n$ for all $n \ge 0$, or equivalently if $\{x : \tau(x) \le n\} \in \mathcal{A}_n$ for all $n$.





We note here the basic fact that if (say) $\tau(x) \le N < \infty$ for all $x$, then

(13)    $\int s_{\tau(x)}(x) \, dm = \int s_N(x) \, dm.$

Indeed, the left-hand side is $\sum_{n=0}^{N} \int_{A_n} s_n(x) \, dm$, with $A_n = \{x : \tau(x) =
n\}$. However, by the martingale property (that is, (14) in the previous
chapter) $\int_{A_n} s_n(x) \, dm = \int_{A_n} s_N(x) \, dm$, and summing over $n$ gives (13)
above.
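The identity (13) can be verified exactly in a small example by enumerating a finite probability space. The sketch below is an illustration under assumptions of our own choosing: the martingale is the simple $\pm 1$ random walk with $s_0 = 0$ on the $2^N$ equally likely sign sequences, and the stopping time is $\tau = \min(\text{first } n \text{ with } s_n = 1,\ N)$. For contrast it also evaluates the walk at the time of its maximum over $0 \le n \le N$, which depends on the future and is therefore not a stopping time. All function names are hypothetical.

```python
from itertools import product


def stopped_value(signs):
    """Return s_tau, where tau is the first n with s_n == 1, capped at N = len(signs)."""
    s = 0
    for step in signs:
        s += step
        if s == 1:                 # the decision to stop uses only the past
            return s
    return s                       # never reached 1: tau = N


def value_at_running_max(signs):
    """Return max_{0<=n<=N} s_n, the walk evaluated at a time that looks into the future."""
    s, best = 0, 0
    for step in signs:
        s += step
        best = max(best, s)
    return best


def check_identity(N):
    """Sum s_tau, s_N, and the anticipating value over all 2^N equally likely paths."""
    total_stopped = total_final = total_anticipating = 0
    for signs in product((1, -1), repeat=N):
        total_stopped += stopped_value(signs)
        total_final += sum(signs)
        total_anticipating += value_at_running_max(signs)
    return total_stopped, total_final, total_anticipating
```

Dividing the sums by $2^N$ gives the integrals in (13). The first two sums agree (both vanish), while the third is strictly positive, which shows that the stopping time condition cannot simply be dropped.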

Similarly, for a subset $A$, we say that the integer-valued function $x \mapsto
\tau(x)$ defined on $A$ is a stopping time relative to $A$ if $\{x \in A : \tau(x) =
n\} \in \mathcal{A}_n$ for all $n$. In this case $\int_A s_{\tau(x)}(x) \, dm = \int_A s_N(x) \, dm$. When this
is applied to $A = \{x : \sup_{n \le N} s_n(x) > \alpha\}$, then this yields essentially the
maximal inequality (15) in the previous chapter.


Martingales are relevant to Brownian motion because that process is a
continuous version of a martingale in the following sense. For each $t \ge 0$,
let $\mathcal{A}_t$ be the $\sigma$-algebra generated by all functions $B_s$, $0 \le s \le t$, that is,
the smallest $\sigma$-algebra containing the $\mathcal{A}_{B_s}$ for all $0 \le s \le t$.$^7$ Then we
have:

(a) For any sequence $0 \le t_0 < t_1 < \cdots < t_n < \cdots$ the sequence
$\{B_{t_n}\}_{n=0}^{\infty}$ is a martingale relative to the $\sigma$-algebras $\{\mathcal{A}_{t_n}\}_{n=0}^{\infty}$.

(b) For almost every $\omega$, the path $B_t(\omega)$ is continuous in $t$.

Now (a) follows immediately from the proof of Proposition 2.6 in the
previous chapter and the fact that the process $B_t$ has independent
increments, with each $B_t$ having mean zero. Also, (b) is the condition B-3
arising in the definition of Brownian motion.

At this point, because it will be useful below, we remark that the
maximal inequality (9) immediately leads to the Brownian motion inequality:

(14)    $P(\{\omega : \sup_{0 \le t \le T} |B_t(\omega)| > \alpha\}) \le \frac{1}{\alpha} \|B_T\|_{L^1}$

for all $T > 0$ and $\alpha > 0$.

In analogy with the discrete case above we say that a non-negative
function $\omega \mapsto \tau(\omega)$ is a stopping time if $\{\omega : \tau(\omega) \le t\} \in \mathcal{A}_t$ for every
$t \ge 0$.

7 To be precise, $\mathcal{A}_t$ is the $\sigma$-algebra generated by all functions $B_s$, $0 \le s \le t$, together
with all subsets of sets of measure zero. See also the previous footnote.

Now suppose $\mathcal{R}$ is a bounded open set of $\mathbb{R}^d$ and define the first “exit
time” for the path $B_t^x(\omega) = x + B_t(\omega)$ to be

    $\tau(\omega) = \tau^x(\omega) = \inf\{t \ge 0,\ B_t^x(\omega) \notin \mathcal{R}\}.$

Also define the “strict” exit time $\tau_* = \tau_*^x$ by

    $\tau_*^x(\omega) = \inf\{t > 0,\ B_t^x(\omega) \notin \mathcal{R}\}.$

Proposition 5.1 Both $\tau^x$ and $\tau_*^x$ are stopping times.

We note that both $\tau$ and $\tau_*$ are well-defined, that is, finite almost
everywhere, because almost every path ultimately exits the bounded open
set $\mathcal{R}$. (See Exercise 14.)

Proof. For simplicity of notation we take $x = 0$; we can then recover
the general case by reducing to the situation where $\mathcal{R}$ is replaced by
$\mathcal{R} - x$. Now for any open set $\mathcal{O}$ in $\mathbb{R}^d$ define $\tau_{\mathcal{O}}(\omega) = \inf\{t \ge 0,\ B_t(\omega) \in \mathcal{O}\}$.
Then, up to a set of measure zero,

$\{\tau_{\mathcal{O}}(\omega) < t\} = \bigcup_{r < t} \{B_r(\omega) \in \mathcal{O}\}$,

where the union is taken over all the indicated rationals $r$. This is because
a continuous path is in $\mathcal{O}$ before time $t$ if and only if it is in $\mathcal{O}$ at
some rational time $r$, with $r < t$. Thus $\{\tau_{\mathcal{O}}(\omega) < t\} \in \mathcal{A}_t$. Next let
$\mathcal{O}_n = \{x : d(x, \mathcal{R}^c) < 1/n\}$. If $t > 0$, then

(15)  $\{\tau(\omega) \le t\} = \bigcap_n \{\tau_{\mathcal{O}_n}(\omega) < t\}$,

because a path exits $\mathcal{R}$ by time $t$, if and only if it is in $\mathcal{O}_n$ before time $t$,
for every $n$. Therefore, for $t > 0$ we have $\{\tau(\omega) \le t\} \in \mathcal{A}_t$. However
$\{\tau(\omega) = 0\}$ is the empty set or $\Omega$, depending on whether $x \in \mathcal{R}$ or not.
Thus $\tau$ is a stopping time.

Note that $\tau_*^x(\omega) = \tau^x(\omega) > 0$ for all $\omega$ if $x \in \mathcal{R}$, while $\tau_*^x(\omega) = \tau^x(\omega) = 0$,
for $x \notin \overline{\mathcal{R}}$. Therefore the only difference between $\tau_*^x$ and $\tau^x$ can occur
when $x$ is on the boundary, $\partial\mathcal{R} = \overline{\mathcal{R}} - \mathcal{R}$. We notice that as above, when
$t > 0$,

$\{\tau_*^x(\omega) \le t\} \in \mathcal{A}_t$.

But then $\{\tau_*^x(\omega) = 0\} \in \bigcap_{t>0} \mathcal{A}_t$. Given the increasing character of the
$\sigma$-algebra $\mathcal{A}_t$, it is natural to denote $\bigcap_{t>0} \mathcal{A}_t$ by $\mathcal{A}_{0+}$. So the proposition
follows from the lemma below.

Lemma 5.2 $\mathcal{A}_{0+} = \mathcal{A}_0$.

The proof of this simple looking fact is however a little indirect. The
conclusion, that any set $A \in \bigcap_{t>0} \mathcal{A}_t$ is trivial (is either of measure 0 or
1), is referred to as Blumenthal’s zero-one law. (A generalization is given
in Exercise 16.)

As a result, for each $x$ in the boundary of $\mathcal{R}$ we have the dichotomy:
$\{\tau_*^x(\omega) = 0\}$ has measure 1 or 0. In the former case, the point $x$ is called
a regular point at the boundary. In brief, a boundary point is regular,
if almost all paths starting at that point are outside $\mathcal{R}$ for arbitrarily
small positive times. This property plays a crucial role in the Dirichlet
problem for $\mathcal{R}$.

Proof of the lemma. Fix a bounded continuous function $f$ on $\mathbb{R}^{kd}$, and
a sequence $0 \le t_1 < t_2 < \cdots < t_k$. For any $\delta \ge 0$, set

$f_\delta = f(B_{t_1+\delta} - B_\delta,\ B_{t_2+\delta} - B_{t_1+\delta},\ \ldots,\ B_{t_k+\delta} - B_{t_{k-1}+\delta})$.

If $A$ is any set in $\mathcal{A}_{0+}$, then $A \in \mathcal{A}_\delta$ for $\delta > 0$. Then by the independence
of the above increments from $B_\delta$, we see that

$\int_A f_\delta \, dP = P(A) \int_\Omega f_\delta \, dP$.

Thus by continuity of the paths we can let $\delta \to 0$ and obtain

$\int_A f_0 \, dP = P(A) \int_\Omega f_0 \, dP$.

Now any bounded continuous function $g$ on $\mathbb{R}^{kd}$ can be written in the
form $g(x_1, \ldots, x_k) = f(x_1, x_2 - x_1, \ldots, x_k - x_{k-1})$ where $f$ is another
such function. As a result

$\int_A g(B_{t_1}, \ldots, B_{t_k}) \, dP = P(A) \int_\Omega g(B_{t_1}, \ldots, B_{t_k}) \, dP$.

Hence by a passage to the limit, this holds if $g$ is the characteristic
function of a Borel set of $\mathbb{R}^{kd}$. Thus $P(A \cap E) = P(A)P(E)$ whenever $E$
is a cylindrical set. From this, we deduce the same equality for any Borel
set $E$ by using Exercise 4 in the previous chapter. Therefore $P(A) = P(A)^2$,
which implies $P(A) = 0$ or $P(A) = 1$. Since $A$ was an arbitrary
element of $\mathcal{A}_{0+}$, the lemma, and also the proposition, are proved.

Note. Lastly, it will be important below to remark that the stopping
time $\tau^x(\omega)$ is jointly measurable in $x$ and $\omega$. This follows from

$\{(x, \omega) : \tau^x(\omega) > \rho\} = \bigcup_{n=1}^{\infty} \bigcap_{r \le \rho,\ r \in \mathbb{Q}} \{\omega : x + B_r(\omega) \in \mathcal{R}_n\}$,

where $\mathcal{R}_n = \{x : d(x, \mathcal{R}^c) > 1/n\}$.

5.2 The strong Markov property 

Suppose $\sigma$ is a stopping time (relative to the $\sigma$-algebras $\{\mathcal{A}_t\}_{t \ge 0}$). We
can define the collection $\mathcal{A}_\sigma$ to be the collection of sets $A$, such that
$A \cap \{\sigma(\omega) \le t\} \in \mathcal{A}_t$, for all $t \ge 0$. One notes that in fact $\mathcal{A}_\sigma$ is a
$\sigma$-algebra; that $\mathcal{A}_\sigma = \mathcal{A}_{\sigma_0}$ if $\sigma(\omega)$ is constant and equals $\sigma_0$; and that $\sigma$ is
measurable with respect to $\mathcal{A}_\sigma$. (See also Exercise 18.)

In studying the Dirichlet problem we shall need, in addition to the
stopping time $\tau$ (the first exit time from $\mathcal{R}$), another stopping time $\sigma$.
What happens when Brownian motion is restarted after time $\sigma$ is the
subject of the “strong Markov property,” one version of which is
contained in the following.

Theorem 5.3 Suppose $B_t$ is a Brownian motion and $\sigma$ is a stopping
time. Then the process $B_t^*$, defined by

$B_t^*(\omega) = B_{\sigma(\omega)+t}(\omega) - B_{\sigma(\omega)}(\omega)$

is also a Brownian motion. Moreover $B_t^*$ is independent of $\mathcal{A}_\sigma$.

In other words, if a Brownian motion is stopped at time $\sigma(\omega)$, then the
process which is appropriately restarted is also a Brownian motion that
is now independent of the past $\mathcal{A}_\sigma$.$^8$

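Proof. We have already noted that if $\sigma(\omega)$ is a constant, $\sigma(\omega) = \sigma_0$,
then $B_t^* = B_{t+\sigma_0} - B_{\sigma_0}$ is a Brownian motion (see Theorem 4.1), so the
assertion in the theorem holds in this case.

Next assume that $\sigma$ is discrete, that is, it takes on only a countable set
of values $\sigma_1 < \sigma_2 < \cdots < \sigma_\ell < \cdots$. Also suppose $0 \le t_1 < t_2 < \cdots < t_k$
are fixed. Let us use the temporary notation

$\mathbf{B} = (B_{t_1}, B_{t_2}, \ldots, B_{t_k})$

$\mathbf{B}^* = (B^*_{t_1}, B^*_{t_2}, \ldots, B^*_{t_k})$

$\mathbf{B}^*_\ell = (B_{t_1+\sigma_\ell} - B_{\sigma_\ell}, \ldots, B_{t_k+\sigma_\ell} - B_{\sigma_\ell})$,

with all these bold-face vectors taking values in $\mathbb{R}^{kd}$. Now if $E$ is a Borel
set in $\mathbb{R}^{kd}$, then

$\{\omega : \mathbf{B}^* \in E\} = \bigcup_\ell \{\omega : \mathbf{B}^*_\ell \in E \text{ and } \sigma = \sigma_\ell\}$.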


8 A corresponding independence when $\sigma$ is an arbitrary positive constant is
characteristic of a “Markov” process.

So

$\{\omega : \mathbf{B}^* \in E\} \cap A = \bigcup_\ell \{\omega : \mathbf{B}^*_\ell \in E\} \cap A \cap \{\sigma = \sigma_\ell\}$

with the union clearly disjoint.

However if $A \in \mathcal{A}_\sigma$, then $A \cap \{\sigma = \sigma_\ell\} \in \mathcal{A}_{\sigma_\ell}$. By the special case
when $\sigma = \sigma_\ell$ is constant throughout, we see that the measure of
$\{\omega : \mathbf{B}^* \in E\} \cap A$ equals

$\sum_\ell P(\mathbf{B}^*_\ell \in E) \, P(A \cap \{\sigma = \sigma_\ell\})$,

because $A \cap \{\sigma = \sigma_\ell\} \in \mathcal{A}_{\sigma_\ell}$ and this set is independent of $\{\mathbf{B}^*_\ell \in E\}$.
However $P(\mathbf{B}^*_\ell \in E) = P(\mathbf{B} \in E)$, and we obtain that

(16)  $P(\{\omega : \mathbf{B}^* \in E,\ \omega \in A\}) = P(\{\mathbf{B} \in E\}) \, P(A)$.

Now using (16) when $A = \Omega$ shows that $B_t^*$ satisfies the conditions B-1
and B-2 of Brownian motion. Also B-3 is obvious. Finally, using (16) for
any $A \in \mathcal{A}_\sigma$ gives the desired independence of $B_t^*$ from $\mathcal{A}_\sigma$.

Turning to the case of general stopping time $\sigma$, we approximate it by
a sequence $\{\sigma^{(n)}\}$ of stopping times, so that each $\sigma^{(n)}$ takes on only a
countable set of values as above, and

(i) $\sigma^{(n)}(\omega) \searrow \sigma(\omega)$, as $n \to \infty$, for every $\omega$; and

(ii) $\mathcal{A}_\sigma \subset \mathcal{A}_{\sigma^{(n)}}$.

For each $n$ define $\sigma^{(n)}(\omega) = k2^{-n}$ if $(k-1)2^{-n} < \sigma(\omega) \le k2^{-n}$ for
$k = 1, 2, \ldots$, and $\sigma^{(n)}(\omega) = 0$ if $\sigma(\omega) = 0$. Property (i) is obvious. Next, for
each $t$ there is a $k$ so that $k2^{-n} \le t < (k+1)2^{-n}$. Then
$\{\sigma^{(n)} \le t\} = \{\sigma \le k2^{-n}\} \in \mathcal{A}_{k2^{-n}} \subset \mathcal{A}_t$. Thus $\sigma^{(n)}$ is a stopping time.

Also suppose that $A \in \mathcal{A}_\sigma$, then $A \cap \{\sigma^{(n)} \le t\} = A \cap \{\sigma \le k2^{-n}\} \in \mathcal{A}_{k2^{-n}} \subset \mathcal{A}_t$,
and hence $A \in \mathcal{A}_{\sigma^{(n)}}$. Thus (ii) is established.
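The dyadic approximation $\sigma^{(n)}$ is simply "round $\sigma$ up to the next multiple of $2^{-n}$". The following sketch, with the hypothetical helper name `dyadic_upper` and exact rational arithmetic chosen only for illustration, makes property (i) concrete for a single value of $\sigma(\omega)$.

```python
from fractions import Fraction
import math


def dyadic_upper(sigma, n):
    """Return k * 2**-n where (k - 1) * 2**-n < sigma <= k * 2**-n.

    For sigma == 0 this returns 0, matching the convention in the text.
    `sigma` is assumed to be a non-negative Fraction (an illustrative choice).
    """
    scale = 2 ** n
    return Fraction(math.ceil(sigma * scale), scale)


if __name__ == "__main__":
    sigma = Fraction(1, 3)
    for n in range(1, 8):
        print(n, dyadic_upper(sigma, n))
```

For a fixed $\sigma(\omega)$ the printed values are non-increasing in $n$, never fall below $\sigma(\omega)$, and differ from it by less than $2^{-n}$.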

Now let $B_t^{*(n)}$ be the analog of $B_t^*$ with $\sigma$ replaced by $\sigma^{(n)}$, and let
$\mathbf{B}^{*(n)} = (B^{*(n)}_{t_1}, \ldots, B^{*(n)}_{t_k})$. Suppose $A \in \mathcal{A}_\sigma$ (then $A \in \mathcal{A}_{\sigma^{(n)}}$). Then
by what we have proved in the discrete case

$P(\{\mathbf{B}^{*(n)} \in E,\ \omega \in A\}) = P(\mathbf{B} \in E) \, P(A)$.

A passage to the limit then shows that (16) holds for the general $\sigma$. This
limiting argument is carried out in two steps using exercises from the
previous chapter. First, by Exercises 10 and 31 part (d), since $\mathbf{B}^{*(n)}$
converges pointwise to $\mathbf{B}^*$, we have that (16) holds whenever $E$ is an
open set. To conclude that such equality holds for all Borel sets $E$, we
apply Exercise 4 in the previous chapter.

For any given stopping time $\sigma$, let us write $B_\sigma$ for the function
$\omega \mapsto B_{\sigma(\omega)}(\omega)$. We note that the argument above, where we approximate the
stopping time, also shows that $B_\sigma$ is $\mathcal{A}_\sigma$-measurable. (See Exercise 18.)

5.3 Other forms of the strong Markov Property 

Another version of the strong Markov property involves integration of
functions defined on all paths. To describe this we need a little additional
notation. We define $\tilde{\mathcal{P}}$ to be the space of all paths, that is, all continuous
functions from $[0, \infty)$ to $\mathbb{R}^d$. The space $\tilde{\mathcal{P}}$ differs from the space $\mathcal{P}$
considered earlier, in that in the latter all paths start at the origin.

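We can write each $\tilde{p}$ in $\tilde{\mathcal{P}}$ as a pair $(p, x)$ with $p \in \mathcal{P}$, $x \in \mathbb{R}^d$ where
$p = \tilde{p} - \tilde{p}(0)$, and $x = \tilde{p}(0)$. So we have $\tilde{\mathcal{P}} = \mathcal{P} \times \mathbb{R}^d$, and every function
$f$ on $\tilde{\mathcal{P}}$ can be written as $f(\tilde{p}) = f_1(p, x)$, with $f_1$ a function on the
product $\mathcal{P} \times \mathbb{R}^d$. Moreover, $\tilde{\mathcal{P}}$ inherits a metric from the metrics on $\mathcal{P}$
and $\mathbb{R}^d$, and a corresponding class of Borel subsets.

We shall also use the short-hand that the path $t \mapsto B_t(\omega)$ will be
designated by $B_\cdot(\omega)$; similarly the path $t \mapsto B_{\sigma(\omega)+t}(\omega)$ will be written as
$B_{\sigma(\omega)+\cdot}(\omega)$; also the paths $t \mapsto B_{\sigma(\omega)+t}(\omega) - B_{\sigma(\omega)}(\omega)$ that appear in
Theorem 5.3 will be represented as $B^*_\cdot(\omega)$. With these definitions our
result is as follows.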

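Theorem 5.4 Let $f$ be a bounded Borel function on the space $\tilde{\mathcal{P}}$ of all
paths. Then

(17)  $\int_\Omega f(B_{\sigma(\omega)+\cdot}(\omega)) \, dP(\omega) = \iint_{\Omega \times \Omega} f(B_{\sigma(\omega')}(\omega') + B_\cdot(\omega)) \, dP(\omega) \, dP(\omega')$.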

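Proof. We write $f(\tilde{p}) = f_1(p, x)$ as above; then (17) becomes

(18)  $\int_\Omega f_1(B^*_\cdot(\omega), B_{\sigma(\omega)}(\omega)) \, dP(\omega) = \iint_{\Omega \times \Omega} f_1(B_\cdot(\omega), B_{\sigma(\omega')}(\omega')) \, dP(\omega) \, dP(\omega')$,

since $B_{\sigma(\omega)+t}(\omega) = B^*_t(\omega) + B_{\sigma(\omega)}(\omega)$.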

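We consider first functions $f_1$ of the product form $f_1 = f_2 \cdot f_3$, with
$f_1(p, x) = f_2(p) f_3(x)$. Then the right-hand side of (18) is

$\int_\Omega f_2(B_\cdot(\omega)) \, dP(\omega) \times \int_\Omega f_3(B_{\sigma(\omega')}(\omega')) \, dP(\omega')$.

However $\int_\Omega f_2(B_\cdot(\omega)) \, dP(\omega) = \int_\Omega f_2(B^*_\cdot(\omega)) \, dP(\omega)$, since by Theorem 5.3
$B^*_t$ is also a Brownian motion and so has the same distribution measures
as $B_t$. Also, by the independence guaranteed by that theorem (and the
fact that $B_{\sigma(\omega')}(\omega')$ is $\mathcal{A}_\sigma$-measurable) we see that the product

$\int_\Omega f_2(B^*_\cdot(\omega)) \, dP(\omega) \times \int_\Omega f_3(B_{\sigma(\omega')}(\omega')) \, dP(\omega')$

equals

$\int_\Omega f_2(B^*_\cdot(\omega)) \, f_3(B_{\sigma(\omega)}(\omega)) \, dP(\omega)$,

which is the left-hand side of (18).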

To pass to the case of general $f$ we may argue as follows. Let $\mu$
and $\nu$ denote the measures on $\tilde{\mathcal{P}}$ defined by $\mu(E)$, (respectively $\nu(E)$),
as the left-hand side, (respectively the right-hand side) of (18) whenever
$f$ is the characteristic function of $E$, with $E$ any Borel set in $\tilde{\mathcal{P}}$. Then
what we have already proved implies that $\mu(E) = \nu(E)$ for all Borel
sets of the form $E = E_2 \times E_3$, with $E_2 \subset \mathcal{P}$ and $E_3 \subset \mathbb{R}^d$. According to
Exercise 4 in the previous chapter, this identity then extends to the
$\sigma$-algebra generated by these sets, and hence to all Borel sets of $\tilde{\mathcal{P}}$, because
this $\sigma$-algebra contains the open sets. Finally, because any bounded Borel
function on $\tilde{\mathcal{P}}$ is the bounded pointwise limit of finite linear combinations
of characteristic functions of Borel sets, we see that (18) holds for all those
$f_1 = f$, and the theorem is proved.

The final version of the strong Markov property we present is the
statement closest to the immediate application to the Dirichlet problem.
It involves two stopping times $\sigma$ and $\tau$, with $\sigma \le \tau$, where $\tau$ is the exit
time for the bounded open set $\mathcal{R}$. Let us recall that $B_t^y(\omega) = y + B_t(\omega)$
and $\tau^y(\omega) = \inf\{t \ge 0,\ B_t^y(\omega) \notin \mathcal{R}\}$. We define the stopped process

$\hat{B}_t^y(\omega) = y + B_{t \wedge \tau^y(\omega)}(\omega)$

where $t \wedge \tau^y(\omega) = \min(t, \tau^y(\omega))$. If $y = 0$ we drop the superscript $y$ in the
above definitions.
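For intuition, the stopped process can be mimicked on a path that has been sampled at finitely many times. The sketch below is only a discrete stand-in for the definition above: the names `first_exit_index` and `stopped_path`, the list-based path, and the `inside` predicate for the region are all assumptions made for illustration, and a sampled path can of course miss an exit that occurs between grid times.

```python
def first_exit_index(path, inside):
    """Index of the first sampled position that is not inside the region.

    Returns None when every sampled position is inside.
    """
    for i, point in enumerate(path):
        if not inside(point):
            return i
    return None


def stopped_path(path, inside):
    """Discrete analogue of the stopped process: freeze the path at its exit."""
    k = first_exit_index(path, inside)
    if k is None:
        return list(path)
    # Before the exit index the path is unchanged; afterwards it stays put.
    return list(path[:k]) + [path[k]] * (len(path) - k)


if __name__ == "__main__":
    samples = [0.0, 0.4, 0.9, 1.2, 0.7, 1.5]
    print(stopped_path(samples, lambda p: abs(p) < 1.0))
```

With the region taken to be the interval $(-1, 1)$, the sample path above leaves at its fourth position and the stopped path stays at that value from then on.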
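Theorem 5.5 Suppose $\sigma$ and $\tau$ are stopping times with $\sigma(\omega) \le \tau(\omega)$ for
all $\omega$. If $F$ is a bounded Borel function on $\mathbb{R}^d$, then for every $t \ge 0$

(19)  $\int_\Omega F(\hat{B}_{\sigma(\omega)+t}(\omega)) \, dP(\omega) = \iint_{\Omega \times \Omega} F(\hat{B}_t^{y(\omega')}(\omega)) \, dP(\omega) \, dP(\omega')$

where $y(\omega') = \hat{B}_{\sigma(\omega')}(\omega')$.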

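Proof. Start with the left-hand side of (19). It equals

$\int_{\tau(\omega) \ge \sigma(\omega)+t} F(\hat{B}_{\sigma(\omega)+t}(\omega)) \, dP(\omega) + \int_{\tau(\omega) < \sigma(\omega)+t} F(\hat{B}_{\sigma(\omega)+t}(\omega)) \, dP(\omega)$

$= \int_{\tau(\omega) \ge \sigma(\omega)+t} F(B_{\sigma(\omega)+t}(\omega)) \, dP(\omega) + \int_{\tau(\omega) < \sigma(\omega)+t} F(B_{\tau(\omega)}(\omega)) \, dP(\omega)$

$= I_1 + I_2$.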


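We will first look at

$I_1 = \int_\Omega F(B_{\sigma(\omega)+t}(\omega)) \, \chi_{\tau(\omega) \ge \sigma(\omega)+t} \, dP(\omega)$.

Consider the following real-valued function on paths:

$f(\tilde{p}) = F(\tilde{p}(t)) \, \chi_{\tau(\tilde{p}) \ge t}$.

Here we define for any path $\tilde{p}$ the quantity $\tau(\tilde{p}) = \inf\{s \ge 0 : \tilde{p}(s) \notin \mathcal{R}\}$.
In particular, note that if $\tilde{p}(\cdot) = B_\cdot(\omega)$, then $\tau(\tilde{p}) = \tau(\omega)$. Now, given $\omega$
set $\tilde{p}(\cdot) = B_{\sigma(\omega)+\cdot}(\omega)$. Then

$f(\tilde{p}) = f(B_{\sigma(\omega)+\cdot}(\omega)) = F(B_{\sigma(\omega)+t}(\omega)) \, \chi_{\tau(\omega) - \sigma(\omega) \ge t}$.

Indeed, note that

$\tau(B_{\sigma(\omega)+\cdot}(\omega)) = \inf\{s \ge 0 : B_{\sigma(\omega)+s}(\omega) \notin \mathcal{R}\} = \tau(\omega) - \sigma(\omega)$.

This is true because the path $B_\cdot(\omega)$ exits at time $\tau(\omega)$, and therefore
the path $B_{\sigma(\omega)+\cdot}(\omega)$ exits at time $\tau(\omega) - \sigma(\omega)$. Therefore

$f(B_{\sigma(\omega)+\cdot}(\omega)) = F(B_{\sigma(\omega)+t}(\omega)) \, \chi_{\tau(\omega) \ge \sigma(\omega)+t}$,

which is the integrand in $I_1$, so we can apply (17) to get

$I_1 = \int_\Omega \int_\Omega f(B_{\sigma(\omega')}(\omega') + B_\cdot(\omega)) \, dP(\omega) \, dP(\omega')$.

But now note that the integrand in the above equals

$F(B_{\sigma(\omega')}(\omega') + B_t(\omega)) \, \chi_{\tau(B_{\sigma(\omega')}(\omega') + B_\cdot(\omega)) \ge t}$.

To conclude the calculation of $I_1$ it suffices to note that the quantity
$\tau(B_{\sigma(\omega')}(\omega') + B_\cdot(\omega))$ equals $\tau^{y(\omega')}(\omega)$, and so

$I_1 = \int_\Omega \int_\Omega F(y(\omega') + B_t(\omega)) \, \chi_{\tau^{y(\omega')}(\omega) \ge t} \, dP(\omega) \, dP(\omega')$

$= \int_\Omega \int_\Omega F(B_t^{y(\omega')}(\omega)) \, \chi_{\tau^{y(\omega')}(\omega) \ge t} \, dP(\omega) \, dP(\omega')$

$= \int_\Omega \int_\Omega F(\hat{B}_t^{y(\omega')}(\omega)) \, \chi_{\tau^{y(\omega')}(\omega) \ge t} \, dP(\omega) \, dP(\omega')$.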


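We now look at the second integral $I_2$ defined by

$I_2 = \int_\Omega F(B_{\tau(\omega)}(\omega)) \, \chi_{\tau(\omega) < \sigma(\omega)+t} \, dP(\omega)$.

Here we define a real-valued function on paths

$g(\tilde{p}) = F(\tilde{p}(\tau(\tilde{p}))) \, \chi_{\tau(\tilde{p}) < t}$.

Setting $\tilde{p}(\cdot) = B_{\sigma(\omega)+\cdot}(\omega)$ gives

$g(B_{\sigma(\omega)+\cdot}(\omega)) = F(B_{\tau(\omega)}(\omega)) \, \chi_{\tau(\omega) < \sigma(\omega)+t}$.

For the characteristic function the argument is the same as above.
For the first part (that is, the component $F(\ldots)$) note that $\tau(\tilde{p})$ gives
the time of exit of $\mathcal{R}$ of the path $\tilde{p}$, and $\tilde{p}(\tau(\tilde{p}))$ the value (in $\mathbb{R}^d$) where
the path exits. Since both $B_{\sigma(\omega)+\cdot}(\omega)$ and $B_\cdot(\omega)$ exit at the same point
in space (although at different times, namely, $\tau(\omega) - \sigma(\omega)$ and $\tau(\omega)$
respectively) we get the above. Therefore by (17)

$I_2 = \int_\Omega \int_\Omega g(B_{\sigma(\omega')}(\omega') + B_\cdot(\omega)) \, dP(\omega) \, dP(\omega')$.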

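Now note that

$g(B_{\sigma(\omega')}(\omega') + B_\cdot(\omega)) = F(B_{\sigma(\omega')}(\omega') + B_{\tau^{y(\omega')}(\omega)}(\omega)) \, \chi_{\tau^{y(\omega')}(\omega) < t}$.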





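Hence

$I_2 = \int_\Omega \int_\Omega g(B_{\sigma(\omega')}(\omega') + B_\cdot(\omega)) \, dP(\omega) \, dP(\omega')$

$= \int_\Omega \int_\Omega F(B_{\sigma(\omega')}(\omega') + B_{\tau^{y(\omega')}(\omega)}(\omega)) \, \chi_{\tau^{y(\omega')}(\omega) < t} \, dP(\omega) \, dP(\omega')$

$= \int_\Omega \int_\Omega F(\hat{B}_t^{y(\omega')}(\omega)) \, \chi_{\tau^{y(\omega')}(\omega) < t} \, dP(\omega) \, dP(\omega')$.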

Therefore, putting the two integrals for $I_1$ and $I_2$ together yields

$I_1 + I_2 = \int_\Omega \int_\Omega F(\hat{B}_t^{y(\omega')}(\omega)) \, dP(\omega) \, dP(\omega')$,


which completes the proof of (19). 

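Final remark. With almost no change in the argument one can prove
generalizations of the two theorems above in which the left-hand side
of (17) and (19) are integrated over any set $A$ in $\mathcal{A}_\sigma$, instead of $\Omega$.
The result corresponding to (17) may then be rephrased in terms of
conditional expectations to read:

$E_{\mathcal{A}_\sigma}(f(B_{\sigma(\omega)+\cdot})) = \int_\Omega f(B_\cdot(\omega) + x) \, dP(\omega) \Big|_{x = B_\sigma}$.

The conclusion corresponding to (19) is

$\int_A F(\hat{B}_{\sigma(\omega)+t}(\omega)) \, dP(\omega) = \int_A \int_\Omega F(\hat{B}_t^{y(\omega')}(\omega)) \, dP(\omega) \, dP(\omega')$,

whenever $A \in \mathcal{A}_\sigma$.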


6 Solution of the Dirichlet problem 

Recall the definitions given at the beginning of Section 5. Here $\mathcal{R}$ is a
bounded open set in $\mathbb{R}^d$, and for each $x \in \mathcal{R}$, we define $\mu^x$ as the measure
on the boundary $\partial\mathcal{R}$ of $\mathcal{R}$, given by

$\mu^x(E) = P(\{\omega : B^x_{\tau^x(\omega)}(\omega) \in E\})$,

with $\tau^x(\omega)$ the first exit time of the path $B_t^x(\omega)$. Here $E$ ranges over the
Borel sets of $\partial\mathcal{R}$, which itself is a compact subset of $\mathbb{R}^d$.

For a continuous function $f$ on $\partial\mathcal{R}$ we defined

(20)  $u(x) = \int_{\partial\mathcal{R}} f(y) \, d\mu^x(y)$, when $x \in \mathcal{R}$.

Observe that $u$ is measurable (and in fact Borel measurable) since

$u(x) = \int_\Omega f(x + B_{\tau^x(\omega)}(\omega)) \, dP(\omega)$,

and $\tau^x(\omega)$ is jointly measurable, as noted at the end of Section 5.1.

The main theorem is as follows. 

Theorem 6.1 If $u$ is defined by (20), then:

(a) $u$ is a harmonic function in $\mathcal{R}$.

(b) $u(x) \to f(y)$, as $x \to y$, for $x \in \mathcal{R}$, if $y$ is a regular point of $\partial\mathcal{R}$.

Proof. To establish (a) we fix $x \in \mathcal{R}$, and let $S$ denote a sphere
centered at $x$ which, together with its interior ball, is contained in $\mathcal{R}$. We will
prove the mean-value property

(21)  $u(x) = \int_S u(y) \, dm(y)$,

where $m$ is the standard measure on the sphere, normalized to have total
mass 1. To prove (21) let $\sigma$ be the stopping time defined as the first time
$B_t^x(\omega)$ hits $S$.

We claim that for any continuous function $G$ on $S$ we have

(22)  $\int_\Omega G(B^x_{\sigma(\omega_1)}(\omega_1)) \, dP(\omega_1) = \int_S G(y) \, dm(y)$.

To see this, consider the case where $x = 0$, and note that the left-hand
side defines a continuous linear functional on the continuous functions
on $S$, and hence is of the form $\int_S G(y) \, d\mu(y)$, for some measure $\mu$ on $S$.
By the rotation invariance of the Brownian motion it follows that $\mu$ is
rotation-invariant and hence by Problem 4 in Chapter 6, Book III, we
have $\mu = m$.

Suppose $\hat{B}_t^x = B^x_{t \wedge \tau^x}$ is the stopped process. Note that
$\hat{B}^x_{\sigma(\omega_1)}(\omega_1) = B^x_{\sigma(\omega_1)}(\omega_1) = y(\omega_1) \in S$, because a path starting at $x$ meets $S$ before it
meets $\partial\mathcal{R}$.

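We now invoke (19). If we take $F$ to be any continuous bounded
extension of $f$ to all of $\mathbb{R}^d$, and let $t \to \infty$ we obtain

(23)  $\int_\Omega \int_\Omega f(B^{y(\omega_1)}_{\tau^{y(\omega_1)}(\omega_2)}(\omega_2)) \, dP(\omega_2) \, dP(\omega_1) = \int_\Omega f(B^x_{\tau^x(\omega)}(\omega)) \, dP(\omega)$.

The right-hand side of (23) above equals $u(x)$, while the left-hand side
equals

$\int_\Omega u(y(\omega_1)) \, dP(\omega_1)$.

Finally, since $B^x_{\sigma(\omega_1)}(\omega_1) = y(\omega_1)$ we can apply (22) with $u = G$ and
deduce that

$\int_\Omega u(y(\omega_1)) \, dP(\omega_1) = \int_S u(y) \, dm(y)$.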

This completes the proof of the mean-value identity (21), and from this it 
follows that u is harmonic. The ideas behind the proof of this well-known 
fact are summarized in Exercise 19. 
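The representation $u(x) = \int_\Omega f(x + B_{\tau^x(\omega)}(\omega))\,dP(\omega)$, combined with (22), suggests a simple Monte Carlo scheme often called "walk on spheres": instead of following the whole path, jump repeatedly to a uniformly distributed point on the largest circle around the current position that stays inside the region, which is legitimate by (22) and the strong Markov property. The sketch below is not taken from the text. It assumes that $\mathcal{R}$ is the unit disc in $\mathbb{R}^2$, stops once the walker is within a tolerance `eps` of the boundary, and uses the hypothetical names `exit_point_unit_disk` and `estimate_u`.

```python
import math
import random


def exit_point_unit_disk(x, rng, eps=1e-3):
    """Approximate exit point of Brownian motion started at x in the unit disc.

    Uses walk on spheres: by (22) the first hit of a circle centred at the
    current point is uniformly distributed on that circle.
    """
    px, py = x
    while True:
        r = 1.0 - math.hypot(px, py)  # distance to the boundary circle
        if r < eps:
            norm = math.hypot(px, py)
            return (px / norm, py / norm)  # project onto the boundary
        theta = rng.uniform(0.0, 2.0 * math.pi)
        px += r * math.cos(theta)
        py += r * math.sin(theta)


def estimate_u(x, f, samples=4000, seed=0):
    """Monte Carlo estimate of u(x) = E[f(exit point)] for the unit disc."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += f(exit_point_unit_disk(x, rng))
    return total / samples


if __name__ == "__main__":
    # Boundary data f(y) = y_1 extends harmonically to u(x) = x_1.
    print(estimate_u((0.3, 0.2), lambda y: y[0]))
```

With boundary data $f(y) = y_1$ the harmonic extension is $u(x) = x_1$, so the estimate at $(0.3, 0.2)$ should be close to $0.3$, up to sampling error and the small bias introduced by `eps`.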

To prove conclusion (b), we establish first that if $y \in \partial\mathcal{R}$ and $y$ is
regular, then

(24)  $\lim_{x \to y,\ x \in \mathcal{R}} P(\{\tau^x > \delta\}) = 0$, for all $\delta > 0$.

In fact, $P(\{B_t^x \in \mathcal{R}, \text{ all } \epsilon \le t \le \delta\})$, $\epsilon > 0$, is continuous in $x$, because
at each $\omega$ for which $B_t$ is continuous, the characteristic function of
$\{B_t^x \in \mathcal{R}, \text{ all } \epsilon \le t \le \delta\}$ at $\omega$ converges to the characteristic function of
$\{B_t^y \in \mathcal{R}, \text{ all } \epsilon \le t \le \delta\}$ at $\omega$, as $x \to y$. However the functions
$P(\{B_t^x \in \mathcal{R}, \text{ all } \epsilon \le t \le \delta\})$ are decreasing as $\epsilon \searrow 0$. The limit is

$P(\{\omega : B_t^x(\omega) \in \mathcal{R}, \text{ all } 0 < t \le \delta\}) = P(\{\tau_*^x > \delta\})$

and is thus upper semi-continuous in $x$. Since $\tau^x = \tau_*^x$ when $x \in \mathcal{R}$, we get
$\limsup_{x \to y} P(\{\tau^x > \delta\}) \le P(\{\tau_*^y > \delta\}) = 0$, since $y$ is a regular point. Thus (24) is established. As
a consequence we have for $\delta > 0$ and $\epsilon > 0$ given,

(25)  $P(\{\omega : |y - B^x_{\tau^x(\omega)}(\omega)| > \delta\}) \le \epsilon$

if $x$ is sufficiently close to $y \in \partial\mathcal{R}$. In fact, by the maximal inequality (14)
we can find a $\delta_1 > 0$, so that $P(\{\omega : \sup_{t \le \delta_1} |B_t(\omega)| > \delta/2\}) \le \epsilon/2$, since
$\|B_{\delta_1}\|_{L^1} = c\,\delta_1^{1/2}$. Also by (24), if $x$ is sufficiently close to $y$,
$P(\{\tau^x > \delta_1\}) \le \epsilon/2$. As a result, if $x$ is sufficiently close to $y$, (25) holds.

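Now

$u(x) - f(y) = \int_{\partial\mathcal{R}} (f(y') - f(y)) \, d\mu^x(y') = \int_{\partial\mathcal{R}_1} + \int_{\partial\mathcal{R}_2} = I_1 + I_2$.

Here $\partial\mathcal{R}_1$ is the set of $y'$ in $\partial\mathcal{R}$ so that $|y' - y| \le \delta$ and $\partial\mathcal{R}_2$ is the
complementary set in $\partial\mathcal{R}$. Now the points $y' \in \partial\mathcal{R}$ are of the form
$y' = B^x_{\tau^x(\omega)}(\omega)$, while $\mu^x(\partial\mathcal{R}_2) = P(\{\omega : |y - B^x_{\tau^x(\omega)}(\omega)| > \delta\})$. Thus by (25) we
see that $\mu^x(\partial\mathcal{R}_2) \le \epsilon$ if $x$ is sufficiently close to $y$. So the contribution
of $I_2$ is majorized by $2 \sup|f| \, \mu^x(\partial\mathcal{R}_2) = O(\epsilon)$. Also $|f(y) - f(y')| < \epsilon$
if $|y - y'| \le \delta$ and $\delta$ is small enough, so the contribution of $I_1$ can be
made less than $\epsilon$. Altogether this shows that $u(x) - f(y)$ is majorized
by a multiple of $\epsilon$ for $x$ sufficiently close to $y$. Since $\epsilon$ was arbitrary, the
second assertion of the theorem is proved.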

Our final result is a very useful sufficient condition for the regularity
of a boundary point. A (truncated) cone $\Gamma$ is the open set

$\Gamma = \{y \in \mathbb{R}^d : |y| < \alpha (y \cdot \gamma),\ |y| < \delta\}$.

Here $\gamma$ is a unit vector, $\alpha > 1$, $\delta > 0$ are fixed, and $y \cdot \gamma$ is the inner
product between $y$ and $\gamma$. The vector $\gamma$ determines the direction of the
cone, and the constant $\alpha$ gives the size of the aperture.
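The two inequalities defining $\Gamma$ translate directly into a membership test. The helper below is only an illustration (the name `in_truncated_cone` and the tuple representation of vectors are assumptions, and `gamma` is taken to be a unit vector as in the definition). Since $|y| < \alpha (y \cdot \gamma)$ says that the cosine of the angle between $y$ and $\gamma$ exceeds $1/\alpha$, larger $\alpha$ means a wider cone.

```python
import math


def in_truncated_cone(y, gamma, alpha, delta):
    """Test whether y lies in {|y| < alpha * (y . gamma), |y| < delta}.

    `gamma` is assumed to be a unit vector; `alpha` > 1 and `delta` > 0.
    """
    norm_y = math.sqrt(sum(c * c for c in y))
    dot = sum(a * b for a, b in zip(y, gamma))
    return norm_y < alpha * dot and norm_y < delta


if __name__ == "__main__":
    gamma = (0.0, 1.0)
    for point in [(0.0, 0.5), (0.4, 0.3), (0.5, 0.0), (0.0, 2.0), (0.0, 0.0)]:
        print(point, in_truncated_cone(point, gamma, alpha=2.0, delta=1.0))
```

Note that the vertex $y = 0$ is never in $\Gamma$, because both sides of the first inequality vanish there.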

Proposition 6.2 Suppose $x \in \partial\mathcal{R}$ and $x + \Gamma$ is disjoint from $\mathcal{R}$, for
some truncated cone $\Gamma$. Then $x$ is a regular point.

Proof. We assume $x = 0$, and consider the set $A$ of Brownian paths
starting at the origin that enter $\Gamma$ for an infinite sequence of times tending
to zero. Let $A_n = \bigcup_{r_k < 1/n} \{\omega : B_{r_k}(\omega) \in \Gamma\}$ where $\{r_k\}$ is an enumeration
of the positive rationals. Then $A = \bigcap_{n=1}^{\infty} A_n$. However $A_n \in \mathcal{A}_{1/n}$ for
each $n$, and hence $A \in \mathcal{A}_{0+} = \mathcal{A}_0$, by the zero-one law. So $P(A) = 0$ or
$P(A) = 1$, and we show that in fact $P(A) = 1$. Assume the contrary, that
is $P(A) = 0$. By the rotation invariance of Brownian motion, the same
result would hold for any rotation of our truncated cone, and finitely
many such rotations cover the ball of radius $\delta$, with the origin excluded,
while every path enters that ball at arbitrarily small times. This is a
contradiction.

Figure 3. Truncated cone at $x$ disjoint from $\mathcal{R}$

Now returning to our boundary point $x$, if $x + \Gamma$ is disjoint from $\mathcal{R}$,
then there are, for almost every $\omega$, arbitrarily small times for which $B_t(\omega) \in \Gamma$,
and hence $B_t^x(\omega) \notin \mathcal{R}$. Thus $x$ is regular.

In view of the above we say that a bounded open set $\mathcal{R}$ satisfies the
outside cone condition, if whenever $x \in \partial\mathcal{R}$, there is a truncated
cone $\Gamma$, so that $x + \Gamma$ is disjoint from $\mathcal{R}$. Our final result generalizes
the theorem proved by very different methods in Chapter 5, Book III
only for the special case of two dimensions.

Corollary 6.3 Suppose the bounded open set $\mathcal{R}$ satisfies the outside cone
condition. Assume $f$ is a given continuous function on $\partial\mathcal{R}$. Then there
is a unique function $u$ that is continuous in $\overline{\mathcal{R}}$, harmonic in $\mathcal{R}$, and such
that $u|_{\partial\mathcal{R}} = f$.

Proof. Theorem 6.1 and Proposition 6.2 show that $u$ is continuous
in $\overline{\mathcal{R}}$ and $u|_{\partial\mathcal{R}} = f$. The uniqueness is a consequence of the well-known
maximum principle.$^9$

7 Exercises 

1. Show that if $t > 0$, then the distribution measure of $S_t^{(N)}$ converges weakly
to the Gaussian $\nu_t$ with mean zero and variance $t$ as $N \to \infty$. More generally, if
$t > s \ge 0$, then the distribution measure of $S_t^{(N)} - S_s^{(N)}$ converges weakly to the
Gaussian $\nu_{t-s}$ with mean zero and covariance matrix $(t - s)I$.

[Hint: Using the notation in the remark following Theorem 2.17 in Chapter 5, and 
setting / fc = r fc , one has S^ N) - S N 、 t = + 


9 See for example Corollary 4.4 in Chapter 5 of Book III.

2. Let $(\mathcal{P}, d)$ be the metric space defined in Section 2. Verify:

(a) The space is complete.

(b) The space is separable.

[Hint: For (b), let $e_1, \ldots, e_d$ be a basis for $\mathbb{R}^d$, and consider the polynomials
$p(t) = e_1 p_1(t) + \cdots + e_d p_d(t)$, where the $p_j$ have rational coefficients.]

3. Show that the metric space $(\mathcal{P}, d)$ is not $\sigma$-compact.

[Hint: Assume the contrary. Then the Baire category theorem implies that there
exists a compact set that has a non-empty interior. As a result there exists an open
ball whose closure is compact. However, consider for example the ball of radius 1
centered at 0, and a sequence of continuous piecewise linear functions $\{f_n\}$ with
$f_n(0) = 1$, $f_n(x) = 0$, when $x \ge 1/n$.]

4. Suppose X is a compact metric space. Show that: 

(a) X is separable. 

(b) C(X) is separable. 

[Hint: For each $m$, find a finite collection $\mathcal{B}_m$ of open balls, each of radius $1/m$, so
that the collection $\mathcal{B}_m$ covers $X$. For (a) take the centers of the balls in $\bigcup_{m=1}^{\infty} \mathcal{B}_m$.
For (b), consider $\{\eta_k^{(m)}\}$ the partition of unity corresponding to the covering of $X$
by $\mathcal{B}_m$ (as given, for example, in Chapter 1). Show that the finite linear
combinations of the $\eta_k^{(m)}$ with rational coefficients are dense in $C(X)$.]

5. Let $X$ be a metric space, $K \subset X$ a compact subset, and $f$ a continuous function
on $K$. Then there is a continuous function $F$ on $X$, so that

$F|_K = f$, and $\sup_{x \in X} |F(x)| = \sup_{x \in K} |f(x)|$.

[Hint: The argument given in Lemma 4.11, Chapter 5 of Book III for $X = \mathbb{R}^d$ can
be copied over in this general setting.]

6. Suppose $K$ is a compact subset of $\mathcal{P}$. Show that for each $T > 0$, there exists a
function $w_T(h)$, defined for $h \in (0,1]$ with $w_T(h) \to 0$ as $h \to 0$ and such that

$\sup_{p \in K} \sup_{0 \le t \le T} |p(t+h) - p(t)| \le w_T(h)$, for $h \in (0,1]$.

[Hint: Fix $T > 0$ and $\epsilon > 0$. Each $p$ is uniformly continuous on closed intervals,
so there exists $\delta = \delta(p) > 0$ so that $\sup_{0 \le t \le T} |p(t+h) - p(t)| \le \epsilon$ whenever
$0 < h \le \delta$. Now use the fact that since $K$ is compact, the covering
$K \subset \bigcup_p \{p' \in \mathcal{P} : d(p', p) < \epsilon\}$ has a finite subcover.]

7. Suppose $\mu_N \to \mu$ weakly. Show as a result:

(a) $\liminf_{N \to \infty} \mu_N(\mathcal{O}) \ge \mu(\mathcal{O})$ for any open set $\mathcal{O}$.

(b) $\lim_{N \to \infty} \mu_N(\mathcal{O}) = \mu(\mathcal{O})$, $\mathcal{O}$ an open set, if $\mu(\overline{\mathcal{O}} - \mathcal{O}) = 0$.

[Hint: $\mu(\mathcal{O}) = \sup_f \{\int f \, d\mu$, where $0 \le f \le 1$ and $\mathrm{supp}(f) \subset \mathcal{O}\}$.]

8. Given the Wiener measure $W$ in $\mathcal{P}$, we have a realization of Brownian motion
(satisfying B-1, B-2 and B-3) with $B_t(\omega) = p(t)$, $\Omega = \mathcal{P}$, and $P = W$. Conversely,
suppose we start with $\{B_t\}$ satisfying B-1, B-2 and B-3. For any cylindrical set
$C = \{p : (p(t_1), \ldots, p(t_k)) \in A\}$ in $\mathcal{P}$, define $W^\circ(C) = P(\{\omega : (B_{t_1}(\omega), \ldots, B_{t_k}(\omega)) \in A\})$.
Verify that $W^\circ$, initially defined on the cylindrical sets, extends to the Wiener
measure on $\mathcal{P}$.

9. This exercise deals with the degree to which the Brownian motion process is 
uniquely determined by the properties B-l, B-2 and B-3. 

Let us say that such a process is “strict” if in addition to the above it satisfies
the following two conditions:

(i) $B_t(\omega_1) = B_t(\omega_2)$ for all $t$ implies $\omega_1 = \omega_2$.

(ii) The collection of measurable sets of $(\Omega, P)$ is exactly $\mathcal{A}_\infty$, which is the
$\sigma$-algebra generated by the $\mathcal{A}_t$, with $t < \infty$.

Now given any Brownian motion process $B_t$ on $(\Omega, P)$ it induces a strict process $B_t^\#$
on $(\Omega^\#, P^\#)$ as follows: Let $\Omega^\#$ denote the collection of equivalence classes on $\Omega$
under the equivalence relation $\omega_1 \sim \omega_2$ if $B_t(\omega_1) = B_t(\omega_2)$ for all $t$. We also denote
by $\{\omega\}$ the equivalence class to which $\omega$ belongs. On $\Omega^\#$ define $B_t^\#(\{\omega\}) = B_t(\omega)$,
and $P^\#(\{A\}) = P(A)$ if $A \in \mathcal{A}_\infty$. Verify:

(a) $B_t^\#$ is a strict Brownian motion on $(\Omega^\#, P^\#)$.

(b) The process $(\mathcal{P}, W)$ constructed in Section 3 is a strict Brownian motion.

(c) If $(B_t^1, \Omega^1, P^1)$ and $(B_t^2, \Omega^2, P^2)$ are a pair of Brownian motion processes,
then up to subsets of sets of measure zero, there is a bijection $\Phi : (\Omega^1)^\# \to (\Omega^2)^\#$
so that $(P^2)^\#(\Phi(A)) = (P^1)^\#(A)$ and $(B_t^2)^\#(\Phi(\omega)) = (B_t^1)^\#(\omega)$.

10. Prove the following version of Khinchin’s inequality (Lemma 1.8 in the
previous chapter). Suppose $\{f_n\}$ are identically distributed $\mathbb{R}^d$-valued functions that
are bounded, have mean zero, and are mutually independent. Then for any $p < \infty$,
we have

$\|\sum a_n f_n\|_{L^p} \le A_p \left(\sum |a_n|^2\right)^{1/2}$.

[Hint: One can reduce to the case $d = 1$. Assuming $\sum |a_n|^2 \le 1$, write
$\int e^{\sum a_n f_n} = \prod_n \int e^{a_n f_n}$ and use $e^u = 1 + u + O(u^2)$ if $|u| \le M$. The result is that the first
integral above is majorized by $\prod (1 + M^2 a_n^2)$ if $|f_n| \le M$ for all $n$.]

11. Prove the following variant of Lemma 3.2. Suppose $\{f_k\}_{k=1}^{\infty}$ is a sequence of
identically distributed, mutually independent, $\mathbb{R}^d$-valued functions on a probability
space $(X, m)$ each having mean zero and the identity as its covariance matrix. If
$s_n = \sum_{k=1}^{n} f_k$ then

$\limsup_{n \to \infty} m(\{x : \sup_{1 \le k \le n} |s_k(x)| > \lambda n^{1/2}\}) = O(\lambda^{-p})$, for every $p > 0$.

[Hint: If v n denotes the distribution measure of Sn/n" 2 and a = An 1 / 2 , then the 
right-hand side of (9) equals j i^〉；\ |^| diy n (t). With A > 1 fix M > 1 and write 

this last integral as the sum of two terms A -1 +A -1 / A M>| t | >A - Using the 

fact that f 丄 dm = 1, the first term is 0(X~ 1 ~ M ). By the central limit theorem 
lirrin—oo A _1 / AM>|t|>A I 亡 I “〜⑴ = O (A -1 / |t|>A M|e _|t|2/2 出 ) so the limit of the 
second term is also 0(A -1-M ).] 

12. Prove that almost everywhere

$|B_t(\omega)| = O(t^{1/2+\epsilon})$, as $t \to \infty$,

for every $\epsilon > 0$. This is the analog of the strong law of large numbers given in
Corollary 2.9 of the previous chapter.

[Hint: If $B_T^*$ denotes $\sup_{0 \le t \le T} |B_t(\omega)|$, then the maximal inequality (14) gives
$W(\{B_T^* > \alpha\}) \le \frac{1}{\alpha} \|B_T\|_{L^1} = c' \, T^{1/2} / \alpha$. If $E_k = \{B^*_{2^k} > 2^{\frac{k}{2}(1+\epsilon)}\}$, then we have
$\sum_{k \ge 0} W(E_k) = O(\sum_{k \ge 0} 2^{-k\epsilon/2}) < \infty$.]

13. If $B_t$ is a Brownian motion process then so is $B'_t = t B_{1/t}$.

[Hint: Note the continuity of almost all paths of $B'_t$ at the origin follows from the
previous exercise. To verify property B-2, use Exercise 29 in the previous chapter.]


14. Show that $\limsup_{t \to 0} |B_t(\omega)| / t^{1/2} = \infty$ almost everywhere; hence almost all
Brownian paths are not Hölder 1/2.

Also show that $\limsup_{t \to \infty} |B_t(\omega)| / t^{1/2} = \infty$ almost everywhere; hence almost all
Brownian paths exit every ball.

[Hint: By the previous exercise it suffices to check the result when $t \to 0$. Consider
$d = 1$. Then

$W(\{|B_\alpha - B_\beta| > \gamma\}) = \frac{1}{\sqrt{2\pi(\beta - \alpha)}} \int_{|u| > \gamma} e^{-u^2 / (2(\beta - \alpha))} \, du$, if $\beta > \alpha$.

Thus

$W(\{|B_{2^{-k}} - B_{2^{-k+1}}| > 2^{-k/2} \mu_k\}) \ge \frac{1}{\sqrt{2\pi}} \int_{|u| > \mu_k} e^{-u^2/2} \, du \ge c_1 e^{-c_2 \mu_k^2}$.

Now choose $\mu_k \to \infty$ so slowly that $\sum_{k \ge 0} e^{-c_2 \mu_k^2} = \infty$ and apply the Borel-Cantelli
lemma (Exercise 20 in the previous chapter).]

15. Calculate the (joint) probability distribution measure of $(B_{t_1}, B_{t_2}, \ldots, B_{t_k})$.
[Hint: Use Exercise 8 (a) in the previous chapter.]

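16. Show that the following generalization of the fact that $\mathcal{A}_{0+} = \mathcal{A}_0$ holds: if we
define $\mathcal{A}_{t+}$ to be $\bigcap_{s > t} \mathcal{A}_s$, then $\mathcal{A}_{t+} = \mathcal{A}_t$.

17. The previous exercise gives the right-continuity of the collection $\{\mathcal{A}_s\}$. Prove
the following left-continuity: for every $t > 0$, $\mathcal{A}_t = \mathcal{A}_{t-}$, where $\mathcal{A}_{t-}$ is the $\sigma$-algebra
generated by all $\mathcal{A}_s$ for $s < t$.

[Hint: Consider first cylindrical sets in $\mathcal{A}_t$.]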


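18. Let $\sigma$ be a stopping time. Show that:

(a) $\sigma$ is $\mathcal{A}_\sigma$-measurable.

(b) $B_{\sigma(\omega)}(\omega)$ is $\mathcal{A}_\sigma$-measurable.

(c) $\mathcal{A}_\sigma$ is the $\sigma$-algebra determined by the stopped process $\hat{B}_t$ with
$\hat{B}_t(\omega) = B_{t \wedge \sigma(\omega)}(\omega)$.

[Hint: For (a), note that $\{\sigma(\omega) \le \alpha\} \cap \{\sigma(\omega) \le t\} = \{\sigma(\omega) \le \min(\alpha, t)\}$. For (b),
show first that for any Borel subset $E$ of $\mathbb{R}^d$ and $t \ge 0$, one has
$\{B_{\sigma(\omega)}(\omega) \in E\} \cap \{\sigma \le t\} \in \mathcal{A}_t$ whenever $\sigma$ takes on only discrete values. Then approximate $\sigma$
by $\sigma^{(n)}$ as in the proof of Theorem 5.3.]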


19. Let $u$ be a bounded Borel measurable function on a bounded open set $\mathcal{R} \subset \mathbb{R}^d$.
Suppose that $u$ satisfies the mean-value property on spheres, that is, (21).

(a) Show that if $B$ is a ball contained in $\mathcal{R}$ and centered at $x$, then

$u(x) = \frac{1}{m(B)} \int_B u(y) \, dy$,

where $m$ is the Lebesgue measure on $\mathbb{R}^d$.

(b) As a result, the function $u$ is continuous in $\mathcal{R}$ and the argument in
Section 4.1, Chapter 5 of Book III shows that the function $u$ is harmonic in $\mathcal{R}$.

[Hint: For (b), show that locally, $u(x) = (u * \varphi)(x)$, where $\varphi$ is a smooth radial
function supported on an appropriately small ball and with $\int \varphi = 1$.]

20. A bounded open set $\mathcal{R}$ has a Lipschitz boundary if $\partial\mathcal{R}$ can be covered
by finitely many balls, so that for each such ball $B$, the set $\partial\mathcal{R} \cap B$ can (possibly
after a rotation and translation) be written as $x_d = \varphi(x_1, \ldots, x_{d-1})$, where $\varphi$ is a
function that satisfies a Lipschitz condition.

Verify that if $\mathcal{R}$ has a Lipschitz boundary, then it satisfies the outside cone
condition. Thus, in particular, if $\mathcal{R}$ is of class $C^1$ (in the sense of Section 4 in
Chapter 7) then $\mathcal{R}$ satisfies the outside cone condition.
So in these cases the Dirichlet problem is uniquely solvable.


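21. Suppose $\mathcal{R}_1$ and $\mathcal{R}_2$ are two open and bounded sets in $\mathbb{R}^d$ with $\mathcal{R}_1 \subset \mathcal{R}_2$. Let
$\mu_1^x$ and $\mu_2^x$ denote the harmonic measures of $\mathcal{R}_1$ and $\mathcal{R}_2$ respectively, as defined at
the beginning of Section 5. Show that the following generalization of the
mean-value property (21) holds: whenever $x \in \mathcal{R}_1$, then

$\mu_2^x = \int_{\partial\mathcal{R}_1} \mu_2^y \, d\mu_1^x(y)$,

in the sense that $\mu_2^x(E) = \int_{\partial\mathcal{R}_1} \mu_2^y(E) \, d\mu_1^x(y)$ for any Borel set $E \subset \partial\mathcal{R}_2$.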


8 Problems 

1. The condition of continuity of Brownian paths B-3 is in effect a consequence of
properties B-1 and B-2. This is implied by the following general theorem.

Suppose that for each $t \ge 0$, we are given an $L^p$ function $F_t = F_t(x)$ on the space
$(X, m)$. Assume that $\|F_{t_1} - F_{t_2}\|_{L^p} \le c|t_1 - t_2|^\alpha$, with $\alpha > 1/p$, and $1 \le p \le \infty$.
Then there is a “corrected” $\tilde{F}_t$, so that for each $t$, $F_t = \tilde{F}_t$ (almost everywhere with
respect to $m$), and so that $t \mapsto \tilde{F}_t(x)$ is continuous for all $t \ge 0$, for almost every
$x \in X$. Moreover the functions $t \mapsto \tilde{F}_t(x)$ satisfy a Lipschitz condition of order $\gamma$
if $\gamma < \alpha - 1/p$.

2. The proof of the Donsker invariance principle follows along the same lines as the
proof of Theorem 3.1. Let $f_1, \ldots, f_n, \ldots$ be a sequence of identically distributed
mutually independent square integrable $\mathbb{R}^d$-valued functions on a probability space
$(X, m)$, each having mean zero and the identity as its covariance matrix. Define

$S_t^{(N)} = \frac{1}{N^{1/2}} \sum_{1 \le k \le [Nt]} f_k + \frac{Nt - [Nt]}{N^{1/2}} f_{[Nt]+1}$

and let $\{\mu_N\}$ be the corresponding measures on $\mathcal{P}$ induced via the measure $m$ on
$X$.

(a) Instead of Lemma 3.2 use Exercise 11 to show that for $T = 1$, $\eta > 0$ and
$\sigma > 0$, there exists $0 < \delta < 1$ and an integer $N_0$ so that for all $0 \le t \le 1$ one
has

$m(\{x : \sup_{0 < h < \delta} |S_{t+h}^{(N)} - S_t^{(N)}| > \sigma\}) \le \delta\eta$, for all $N \ge N_0$.

(b) Deduce from the above that for all $T > 0$, $\epsilon > 0$, and $\sigma > 0$ there is a $\delta > 0$
so that

$m(\{x : \sup_{0 \le t \le T,\ 0 < h < \delta} |S_{t+h}^{(N)} - S_t^{(N)}| > \sigma\}) \le \epsilon$, for all $N \ge 1$.

(c) Use the inequality in (b) to show that the sequence $\{\mu_N\}$ is tight.

(d) Conclude as before that $\{\mu_N\}$ converges weakly to $W$.


3. There are a number of other constructions of Brownian motion besides the one 
given in this chapter. A particularly elegant approach is based on simple Hilbert 
space ideas. 

On $(\Omega, P)$ consider a sequence $\{f_n\}$ of independent, identically distributed $\mathbb{R}^d$-
valued functions with Gaussian distribution of mean zero and covariance matrix
equal to the identity. Observe that the sequence $\{f_n\}$ is an orthonormal sequence
of $L^2(\Omega, \mathbb{R}^d)$. Let $\mathcal{H}$ denote the closed subspace of $L^2(\Omega, \mathbb{R}^d)$ spanned by $\{f_n\}$.

Observe that $\mathcal{H}$ is a separable infinite dimensional Hilbert space. Hence there
is a unitary correspondence $U$ between $L^2([0, \infty), dx)$ and $\mathcal{H}$. Let $B_t = U(\chi_t)$
where $\chi_t$ is the characteristic function of the interval $[0, t]$. Then, each $B_t$ can be
corrected as in Problem 1, so that the process $\{B_t\}$ becomes Brownian motion. In
this connection see also Exercise 9 in Chapter 5.

Note that for instance, if $B_t = \sum c_n(t) f_n$, then $B_t - B_s = \sum [c_n(t) - c_n(s)] f_n$
with $\sum |c_n(t) - c_n(s)|^2 = t - s$.

4.* In the previous chapter, we noted that recurrence results for the (discrete)
random walks depend on the dimension $d$, and in particular, whether $d \le 2$ or
$d \ge 3$ (see Theorem 2.18 in Chapter 5 and the remark that follows it).

One can establish the following results for the (continuous) Brownian motion
$B_t$ in $\mathbb{R}^d$.

(a) If $d = 1$, Brownian motion hits, almost surely, every point infinitely often,
in the sense that for each $x \in \mathbb{R}$ and for any $t_0 > 0$,

\[ P(\{\omega : B_t(\omega) = x \text{ for some } t \ge t_0\}) = 1. \]

Thus $B_t$ is pointwise recurrent in $\mathbb{R}$.

(b) If $d \ge 2$, then for every point $x \in \mathbb{R}^d$ Brownian motion almost surely never
hits that point, that is,

\[ P(\{\omega : B_t(\omega) = x \text{ for some } t > 0\}) = 0. \]

So, in this case, Brownian motion is not pointwise recurrent.

(c) However if $d = 2$, then $B_t$ is recurrent in every neighborhood of every point,
that is, if $D$ is any open disc with positive radius, and $t_0 > 0$, then

\[ P(\{\omega : B_t(\omega) \in D \text{ for some } t \ge t_0\}) = 1. \]

(d) Finally, when $d \ge 3$, Brownian motion is transient, that is, it escapes to
infinity in the sense that

\[ P(\{\omega : \lim_{t \to \infty} |B_t(\omega)| = \infty\}) = 1. \]
5.* The law of the iterated logarithm describes the amplitude of the oscillations
of Brownian motion as $t \to \infty$ and $t \to 0$: if $B_t$ is an $\mathbb{R}$-valued Brownian motion
process, then for almost all $\omega$

\[ \limsup_{t \to \infty} \frac{B_t(\omega)}{\sqrt{2t \log\log t}} = 1, \qquad \liminf_{t \to \infty} \frac{B_t(\omega)}{\sqrt{2t \log\log t}} = -1. \]

By Exercise 13, time inversion implies that for almost all $\omega$

\[ \limsup_{t \to 0} \frac{B_t(\omega)}{\sqrt{2t \log\log(1/t)}} = 1, \qquad \liminf_{t \to 0} \frac{B_t(\omega)}{\sqrt{2t \log\log(1/t)}} = -1. \]


6.* There is a converse to Theorem 6.1 when $d \ge 2$: if $u(x) \to f(y)$ as $x \to y$ with
$x \in \mathcal{R}$, for each continuous function $f$, then $y$ is a regular point.

[Hint: If $y$ is not regular, then, using Problem 4* (b), show that $P(\{|B^y_{\tau^y} - y| >
0\}) = 1$, hence $P(\{|B^y_{\tau^y} - y| > \delta\}) > 1/2$ for some $\delta > 0$. If $S_\epsilon$ denotes the sphere
centered at $y$ of radius $\epsilon < \delta$, use the strong Markov property to prove that there
exists $x_\epsilon \in S_\epsilon \cap \mathcal{R}$ so that $P(\{|B^{x_\epsilon}_{\tau^{x_\epsilon}} - y| > \delta\}) > 1/2$. Then, considering any
continuous function $0 \le f \le 1$ on $\partial\mathcal{R}$ with $f(y) = 1$, and $f(z) = 0$ whenever $|z - y| > \delta$,
leads to a contradiction.]

7.* A simple example of a non-regular point arises when we remove from an open
ball its center, with the center then becoming a non-regular point. A more
interesting example of a non-regular point is given by Lebesgue’s thorn with its cusp
at the origin.

Suppose $d \ge 3$, and consider the ball $B_1 = \{x \in \mathbb{R}^d : |x| < 1\}$ from which we
remove the set

\[ E = \{(x_1, \dots, x_d) \in \mathbb{R}^d : 0 \le x_1 \le 1,\ x_2^2 + \cdots + x_d^2 \le f(x_1)\}. \]

Here $f$ is continuous and $f(x) > 0$ if $x > 0$. If $f(x)$ decreases sufficiently rapidly
as $x \to 0$, then the origin is non-regular for the set $\mathcal{R} = B_1 - E$. Clearly, $\mathcal{R}$ can be
modified so that its boundary is smooth except at the origin.



7 A Glimpse into Several Complex Variables


In dealing with the existence of solutions of partial
differential equations it was customary during the
nineteenth century and it still is today in many
applications, to appeal to the theorem of Cauchy-Kowalewski,
which guarantees the existence of analytic solutions
for analytic partial differential equations. On the other
hand a deeper understanding of the nature of
solutions requires the admission of non-analytic functions
in equations and solutions. For large classes of
equations this extension of the range of equation and
solution has been carried out since the beginning of this
century. In particular much attention has been given
to linear partial differential equations and systems of
such. Uniformly the experience of the investigated
types has shown that — speaking of existence in the 
local sense — there always were solutions, indeed, 
smooth solutions, provided the equations were smooth 
enough. It was therefore a matter of considerable
surprise to this author, to discover that this inference is
in general erroneous.

H. Lewy, 1957


When we go beyond the introductory parts of the subject, what is 
striking is the extent to which the study of complex analysis in several 
variables differs from that of one variable. Among the new features 
that arise are: the automatic analytic continuation of functions from 
certain domains to larger domains; the crucial role of the tangential 
Cauchy-Riemann operators; and the significance of (complex) convexity 
properties of boundaries of domains. 

Even though the subject has developed far exploiting these concepts, 
it is our purpose here to give the reader only a first look at these ideas. 


1 Elementary properties 

The definition and elementary properties of analytic (or “holomorphic”)
functions in $\mathbb{C}^n$ are straightforward adaptations of the corresponding
notions for the case $n = 1$. We start with a bit of notation. For any
$z^0 = (z_1^0, \dots, z_n^0) \in \mathbb{C}^n$ and $r = (r_1, \dots, r_n)$ with $r_j > 0$, we denote by
$P_r(z^0)$ the polydisc given by the product

\[ P_r(z^0) = \{z = (z_1, \dots, z_n) \in \mathbb{C}^n : |z_j - z_j^0| < r_j, \text{ for all } 1 \le j \le n\}. \]

We will also set $C_r(z^0)$ to be the corresponding product of boundary
circles

\[ C_r(z^0) = \{z = (z_1, \dots, z_n) \in \mathbb{C}^n : |z_j - z_j^0| = r_j, \text{ all } 1 \le j \le n\}. \]

We also write $z^\alpha$ for the monomial $z_1^{\alpha_1} z_2^{\alpha_2} \cdots z_n^{\alpha_n}$ where $\alpha = (\alpha_1, \dots, \alpha_n)$
with $\alpha_j$ non-negative integers.

We shall see below that for any continuous function $f$ on an open set $\Omega$,
the following conditions, defining the analyticity of $f$, are equivalent:

(i) The function $f$ satisfies the Cauchy-Riemann equations

\[ \frac{\partial f}{\partial \bar z_j} = 0, \quad \text{for } j = 1, \dots, n \tag{1} \]

(taken in the sense of distributions). Here

\[ \frac{\partial f}{\partial \bar z_j} = \frac{1}{2}\Big(\frac{\partial f}{\partial x_j} + i \frac{\partial f}{\partial y_j}\Big) \]

and $z_j = x_j + i y_j$ with $x_j, y_j \in \mathbb{R}$.

(ii) For each $z^0 \in \Omega$ and $1 \le k \le n$, the function

\[ g(z_k) = f(z_1^0, \dots, z_{k-1}^0, z_k, z_{k+1}^0, \dots, z_n^0) \]

is analytic in $z_k$ (in the one-variable sense) for $z_k$ in some
neighborhood of $z_k^0$.

(iii) For any polydisc $P_r(z^0)$ whose closure lies in $\Omega$ we have the Cauchy
integral representation

\[ f(z) = \frac{1}{(2\pi i)^n} \int_{C_r(z^0)} f(\zeta) \prod_{k=1}^n \frac{d\zeta_k}{\zeta_k - z_k}, \quad \text{for } z \in P_r(z^0). \tag{2} \]

(iv) For each $z^0 \in \Omega$, the function $f$ has a power series expansion
$f(z) = \sum a_\alpha (z - z^0)^\alpha$ that converges absolutely and uniformly in a
neighborhood of $z^0$.
Proposition 1.1 For a continuous function $f$ given in an open set $\Omega$,
the conditions (i) to (iv) above are equivalent.

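Proof. To see why (i) implies (ii), let $\Delta$ be the Laplacian on $\mathbb{C}^n$,

\[ \Delta = \sum_{j=1}^n \Big(\frac{\partial^2}{\partial x_j^2} + \frac{\partial^2}{\partial y_j^2}\Big), \]

with $z_j = x_j + i y_j$, and where $\mathbb{C}^n$ is thus identified with $\mathbb{R}^{2n}$. Note that
then

\[ \Delta = 4 \sum_{j=1}^n \frac{\partial}{\partial z_j} \frac{\partial}{\partial \bar z_j}, \]

where $\frac{\partial}{\partial \bar z_j} = \frac{1}{2}\big(\frac{\partial}{\partial x_j} + i \frac{\partial}{\partial y_j}\big)$ and $\frac{\partial}{\partial z_j} = \frac{1}{2}\big(\frac{\partial}{\partial x_j} - i \frac{\partial}{\partial y_j}\big)$, so if $f$
satisfies (i) (in the sense of distributions), then in fact $\Delta f = 0$. From the
ellipticity of the operator $\Delta$ and its resulting regularity (see Section 2.5
of Chapter 3) we see that $f$ is in $C^\infty$, and in particular in $C^1$. Thus
the Cauchy-Riemann equations are satisfied in the usual sense and (ii)
is established.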
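
The factorization of the Laplacian used in this step can be checked directly. The following is a short verification in one variable (with $z = x + iy$), offered as a sketch; the case of $n$ variables follows by summing over $j$.

```latex
% The cross terms cancel because \partial_x and \partial_y commute.
\[
  4\,\frac{\partial}{\partial z}\frac{\partial}{\partial \bar z}
    = \Big(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\Big)
      \Big(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\Big)
    = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}.
\]
% Consequently, if \partial f / \partial \bar z_j = 0 for every j, then
\[
  \Delta f = 4 \sum_{j=1}^{n} \frac{\partial}{\partial z_j}
             \Big(\frac{\partial f}{\partial \bar z_j}\Big) = 0 .
\]
```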

Now suppose $z \in P_r(z^0)$, with $\overline{P_r(z^0)} \subset \Omega$. Then if (ii) holds we can
apply the one-variable Cauchy integral formula in the first variable, with
$z_2, z_3, \dots, z_n$ fixed, to obtain

\[ f(z) = \frac{1}{2\pi i} \int_{|\zeta_1 - z_1^0| = r_1} \frac{f(\zeta_1, z_2, \dots, z_n)}{\zeta_1 - z_1} \, d\zeta_1. \]

Next, using the Cauchy integral formula in the second variable to
represent $f(\zeta_1, z_2, \dots, z_n)$ with $\zeta_1, z_3, \dots, z_n$ fixed, gives

\[ f(z) = \frac{1}{(2\pi i)^2} \int_{|\zeta_1 - z_1^0| = r_1} \int_{|\zeta_2 - z_2^0| = r_2} \frac{f(\zeta_1, \zeta_2, z_3, \dots, z_n)}{(\zeta_2 - z_2)(\zeta_1 - z_1)} \, d\zeta_2 \, d\zeta_1. \]

Continuing this way yields assertion (iii).

To obtain (iv) as a consequence of (iii), note that

\[ \frac{1}{\zeta_k - z_k} = \frac{1}{\zeta_k - z_k^0 - (z_k - z_k^0)} = \sum_{m=0}^\infty \frac{(z_k - z_k^0)^m}{(\zeta_k - z_k^0)^{m+1}}. \]

This series converges for $z \in P_r(z^0)$ and $\zeta \in C_r(z^0)$, since then
$|z_k - z_k^0| < |\zeta_k - z_k^0| = r_k$ for all $k$. So if we take $P_r(z^0)$ with $\overline{P_r(z^0)} \subset \Omega$, and
insert for each $k$ the series in formula (2) we get $f(z) = \sum a_\alpha (z - z^0)^\alpha$
with

\[ a_\alpha = \frac{1}{(2\pi i)^n} \int_{C_r(z^0)} f(\zeta) \prod_{k=1}^n \frac{d\zeta_k}{(\zeta_k - z_k^0)^{\alpha_k + 1}}. \]

As a result $|a_\alpha| \le M r^{-\alpha}$, where $r^{-\alpha} = r_1^{-\alpha_1} r_2^{-\alpha_2} \cdots r_n^{-\alpha_n}$, and

\[ M = \sup_{\zeta \in C_r(z^0)} |f(\zeta)|. \]

Thus the series converges uniformly and absolutely if $z \in P_{r'}(z^0)$ and
$r'_k < r_k$, for all $k = 1, \dots, n$.
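
The convergence claim follows from the coefficient bound by comparison with a product of geometric series. A minimal sketch of that step is given below, together with a concrete check; the example function is chosen here only for illustration and is not taken from the text.

```latex
% For z in the smaller polydisc, |z_k - z_k^0| <= r'_k < r_k, so
\[
  \sum_{\alpha} |a_\alpha|\, |(z - z^0)^\alpha|
    \le M \sum_{\alpha} \prod_{k=1}^{n} \Big(\frac{r'_k}{r_k}\Big)^{\alpha_k}
    = M \prod_{k=1}^{n} \frac{r_k}{r_k - r'_k} < \infty .
\]
% Illustrative check in C^2 with z^0 = 0 and 0 < r_1, r_2 < 1:
\[
  f(z_1, z_2) = \frac{1}{(1 - z_1)(1 - z_2)}
              = \sum_{\alpha_1, \alpha_2 \ge 0} z_1^{\alpha_1} z_2^{\alpha_2},
  \qquad a_\alpha = 1,
\]
\[
  M = \frac{1}{(1 - r_1)(1 - r_2)}, \qquad
  |a_\alpha| = 1 \le \frac{r_1^{-\alpha_1} r_2^{-\alpha_2}}{(1 - r_1)(1 - r_2)} = M r^{-\alpha}.
\]
```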

To complete the proof of the proposition, note that (iv) implies (i)
as follows. If $\sum a_\alpha (z - z^0)^\alpha$ converges absolutely for all $z$ near $z^0$, we
can choose a $z'$ near $z^0$, so that $z'_k - z_k^0 \ne 0$ for each $k$ with $1 \le k \le n$,
and thus $\sum |a_\alpha| \rho^\alpha$ converges with $\rho = (\rho_1, \dots, \rho_n)$, $\rho_k = |z'_k - z_k^0| > 0$.
Thus for any $z \in P_\rho(z^0)$ we can differentiate the series term by term and
see that in particular $f$ is in $C^1$ in that polydisc and satisfies the usual
Cauchy-Riemann equations there. Since this is valid for each $z^0 \in \Omega$, it
follows that $f$ is of class $C^1$ throughout $\Omega$ and satisfies (1) in the usual
sense. A fortiori property (i) holds, and the proof of the proposition is
concluded.

Two additional remarks are in order. First, the requirement in (i)
that $f$ be continuous can be weakened. In particular, if $f$ is merely
locally integrable and satisfies (i) in the sense of distributions then $f$ can
be corrected on a set of measure zero so as to become continuous (and
thus by the above, analytic).

Second, a more difficult equivalence is that it suffices to have
assertion (ii) without the a priori assumption that $f$ be (jointly) continuous.
See Problem 1*.

Another aspect of analysis in $\mathbb{C}^n$ that is essentially unchanged from
the case of one variable is the following feature of analytic identity.

Proposition 1.2 Suppose $f$ and $g$ are a pair of holomorphic functions
in a region¹ $\Omega$, and $f$ and $g$ agree in a neighborhood of a point $z^0 \in \Omega$.
Then $f$ and $g$ agree throughout $\Omega$.

¹ Recall that a region is defined to be an open and connected set.

Proof. We may assume that $g = 0$. If we fix any point $z' \in \Omega$,
it suffices to prove that $f(z') = 0$. Using the pathwise connectedness
of $\Omega$ we can find a sequence of points $z^1, \dots, z^N = z'$ in $\Omega$ and polydiscs
$P_{r^k}(z^k)$, for $0 \le k \le N$, so that

(a) $P_{r^k}(z^k) \subset \Omega$,

(b) $z^{k+1} \in P_{r^k}(z^k)$, for $0 \le k \le N - 1$.

Now if $f$ vanishes in a neighborhood of $z^k$, it must necessarily vanish
in all of $P_{r^k}(z^k)$. (This little fact is established in Exercise 1.) Thus $f$
vanishes in $P_{r^0}(z^0)$, and by (b), it vanishes in $P_{r^{k+1}}(z^{k+1})$ if it vanishes
in $P_{r^k}(z^k)$. Hence, by an induction on $k$, we arrive at the conclusion
that the function $f$ vanishes on $P_{r^N}(z^N)$, and therefore $f(z') = 0$, and
the proposition is proved.

2 Hartogs’ phenomenon: an example 

As soon as we get past the elementary properties of holomorphic
functions of several variables, we find new phenomena for which there are no
analogs in the case of one variable. This is highlighted by the following
striking example.

We let $\Omega$ be the region in $\mathbb{C}^n$, $n \ge 2$, lying between two concentric
spheres; take in particular $\Omega = \{z \in \mathbb{C}^n,\ \rho < |z| < 1\}$, for some fixed
$0 < \rho < 1$.

Theorem 2.1 Suppose $F$ is holomorphic in $\Omega = \{z \in \mathbb{C}^n : \rho < |z| < 1\}$,
for some fixed $\rho$, $0 < \rho < 1$. Then $F$ can be analytically continued into
the ball $\{z \in \mathbb{C}^n : |z| < 1\}$.

Here we give a simple and elementary proof of this. Using more
sophisticated arguments we shall see below that this property of “automatic”
continuation holds under very general circumstances.

The quick proof we have in mind is based on a primitive example of
this continuation, which we give in the case of $\mathbb{C}^2$. Suppose

\[ K_1 = \{(z_1, z_2) : |z_1| \le a, \text{ and } |z_2| = b_1\} \]

and

\[ K_2 = \{(z_1, z_2) : |z_1| = a, \text{ and } b_2 \le |z_2| \le b_1\}. \]

Lemma 2.2 If the function $F$ is holomorphic in a region $\mathcal{O}$ that
contains the union $K_1 \cup K_2$ then $F$ extends analytically to an open set $\tilde{\mathcal{O}}$
containing the product set

\[ \{(z_1, z_2) : |z_1| \le a,\ b_2 \le |z_2| \le b_1\}. \tag{3} \]
Figure 1. $\mathcal{O}$ contains the shaded region

See Figure 1 for an illustration of the sets $K_1$, $K_2$ and their product.

Proof. Consider the integral

\[ I(z_1, z_2) = \frac{1}{2\pi i} \int_{|\zeta_1| = a + \epsilon} \frac{F(\zeta_1, z_2)}{\zeta_1 - z_1} \, d\zeta_1, \]

which is well-defined for small positive $\epsilon$, when $(z_1, z_2)$ is in a
neighborhood $\tilde{\mathcal{O}}$ of the product set (3). In fact then the variable of integration
ranges over a neighborhood of $K_2$, where $F$ is analytic and hence
continuous. Moreover $I(z_1, z_2)$ is analytic in $\tilde{\mathcal{O}}$, since it is visibly analytic in $z_1$
for fixed $z_2$ when $|z_1| < a + \epsilon$, and $z_2$ is near the set $b_2 \le |z_2| \le b_1$; also it
is analytic in $z_2$ (for fixed $z_1$) in that set, by virtue of the analyticity of $F$.
Finally when $(z_1, z_2)$ is near the set $K_1$, then $I(z_1, z_2) = F(z_1, z_2)$ by the
Cauchy integral formula, and thus $I$ provides the desired continuation
of $F$.

We give the proof of the theorem in the case $n = 2$, and start when
$\rho < 1/\sqrt{2}$. Here we let $K_1 = \{|z_1| \le a_1,\ |z_2| = b_1\}$ and $K_2 = \{|z_1| =
a_1,\ b_2 \le |z_2| \le b_1\}$ with $a_1 = b_1$, $\rho < a_1, b_1 < 1/\sqrt{2}$, and $b_2 = 0$. (See
Figure 2.)

Then $K_1$ and $K_2$ both belong to $\Omega$, and according to the lemma, $F$
continues to the product $\{|z_1| < 1/\sqrt{2},\ |z_2| < 1/\sqrt{2}\}$, which together
with $\Omega$ covers the entire unit ball.

Figure 2. The case where $\rho < 1/\sqrt{2}$

When $1/\sqrt{2} \le \rho < 1$, we use the same idea, but now carry out the
argument by descending in a finite number of steps the staircase in the
$(|z_1|, |z_2|)$ plane whose corners are denoted by $(\alpha_k, \beta_k)$. (See Figure 3.)

We take $\beta_1 = \rho$, $\alpha_1 = (1 - \beta_1^2)^{1/2} = (1 - \rho^2)^{1/2}$, and more generally
$\beta_{k+1}^2 = \rho^2 - \alpha_k^2$, $\alpha_{k+1}^2 = 1 - \beta_{k+1}^2$. Hence $\beta_k^2 = 1 - k(1 - \rho^2)$, $\alpha_k^2 = k(1 - \rho^2)$.

We start at $k = 1$ and stop as soon as $1 - k(1 - \rho^2) < 0$ for $k = N$,
with $N$ the smallest integer $> 1/(1 - \rho^2)$. With this we choose $a_k$, $b_k$
so that $a_k < \alpha_k$, $b_k > \beta_k$ with $(a_k, b_k)$ near $(\alpha_k, \beta_k)$, yet $a_N = 1$, $b_N = 0$.

Now let $\mathcal{R}_k = \{\rho < |z| < 1\} \cup \{|z| < 1;\ b_k \le |z_2|\}$. As above, the lemma
gives a continuation of $F$ into a neighborhood of $\mathcal{R}_1$. Using the lemma
again (this time with $a = a_k$, $b_1 = b_k$, $b_2 = b_{k+1}$) gives a continuation
of $F$ from a neighborhood of $\mathcal{R}_k$ to a neighborhood of $\mathcal{R}_{k+1}$. Now
$\mathcal{R}_N = \{|z| < 1\}$, and so we are done.
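
To make the staircase concrete, here is a worked numerical instance. The value $\rho^2 = 0.7$ is chosen only for illustration and does not come from the text.

```latex
% With \rho^2 = 0.7 we have 1 - \rho^2 = 0.3, so
%   \alpha_k^2 = 0.3\,k  and  \beta_k^2 = 1 - 0.3\,k .
\[
\begin{array}{c|cccc}
  k          & 1   & 2   & 3   & 4    \\ \hline
  \alpha_k^2 & 0.3 & 0.6 & 0.9 & 1.2  \\
  \beta_k^2  & 0.7 & 0.4 & 0.1 & -0.2
\end{array}
\]
% The process stops at N = 4, the smallest integer exceeding 1/(1 - \rho^2) = 10/3.
% Each corner lies on the outer sphere, and each intermediate point on the inner one:
\[
  \alpha_k^2 + \beta_k^2 = 1, \qquad \alpha_k^2 + \beta_{k+1}^2 = \rho^2 .
\]
```

The two identities in the last line are what force the steps of the staircase to fit between the spheres $|z| = \rho$ and $|z| = 1$.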

The corresponding argument in dimension $\ge 3$ is similar to that of
$n = 2$, and is left to the interested reader to work out.

We mention one immediate application of the previous theorem: a
holomorphic function in $\mathbb{C}^n$, $n > 1$, cannot have an isolated singularity;
nor can it have an isolated zero. In fact we need only apply Theorem 2.1
to an appropriate pair of concentric balls, centered at the purported
singularity. The fact that a zero of $f$ cannot be isolated follows from the
previous conclusion applied to the function $1/f$. A more extensive
assertion holds, namely if $f$ is holomorphic in $\Omega$ and vanishes somewhere,
its zero set must reach the boundary of $\Omega$. (See Exercise 4.) Also the
nature of the zero set of $f$ near a point where $f$ vanishes can be
described quite precisely by the Weierstrass preparation theorem, discussed
in Problem 2*.

Finally, notice that holomorphic functions inside the unit ball $\{|z| < 1\}$
cannot necessarily be extended outside the ball, as the simple example
$f(z) = 1/(z_1 - 1)$ shows. In fact, we shall see later that the “convexity”
of the boundary of $\Omega$ plays a crucial role in determining whether a
function can be extended past its boundary.
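
It may help to see why the one-variable counterexample to continuation does not carry over. The following comparison is a sketch; the functions $1/z$ and $1/z_1$ are the standard illustrations and are introduced here, not taken from the text.

```latex
% n = 1: the function 1/z is holomorphic in the annulus \rho < |z| < 1,
% yet it has no holomorphic extension to the disc |z| < 1.
%
% n = 2: the analogous candidate 1/z_1 is singular on the whole complex line
% \{z_1 = 0\}, and this line meets the shell:
\[
  \{z_1 = 0\} \cap \{\rho < |z| < 1\}
    = \{(0, z_2) : \rho < |z_2| < 1\} \neq \emptyset .
\]
% So 1/z_1 is not holomorphic in the shell to begin with, and Theorem 2.1
% is not contradicted.
%
% By contrast, for f(z) = 1/(z_1 - 1) the singular set \{z_1 = 1\} meets the
% closed unit ball only at the boundary point (1, 0, \dots, 0), so f is
% holomorphic in \{|z| < 1\} but cannot be continued past that point.
```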


3 Hartogs’ theorem: the inhomogeneous Cauchy-Riemann 
equations 

Having seen some simple examples of automatic analytic continuation,
we now come to the general situation. The method that will be used
here, and that turns out to be useful in a number of questions in
complex analysis, is the study of solutions of the system of inhomogeneous
Cauchy-Riemann equations

\[ \frac{\partial u}{\partial \bar z_j} = f_j \quad \text{for } j = 1, \dots, n, \tag{4} \]

where the $f_j$ are given functions.

The wide applicability of solutions of these equations results from the
following necessity. Often one wishes to construct a holomorphic
function $F$ with certain desired properties. A first approximation $F_1$ can be
found that enjoys these properties, but with that function not usually
holomorphic. The extent to which it fails to satisfy that requirement is
given by the non-vanishing of $\partial F_1/\partial \bar z_j = f_j$, for $1 \le j \le n$. Now if we
could find an appropriately well-chosen $u$ that solves $\partial u/\partial \bar z_j = f_j$, then
we could correct our $F_1$ by subtracting $u$ from it. In the case below, the
“good” choice of $u$ will be the one that has compact support (assuming
the $f_j$ have compact support).
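In considering (4), we look first at the one-dimensional case, which is

\[ \frac{\partial u}{\partial \bar z}(z) = f(z), \quad \text{where } \frac{\partial}{\partial \bar z} = \frac{1}{2}\Big(\frac{\partial}{\partial x} + i \frac{\partial}{\partial y}\Big) \text{ and } z = x + iy \in \mathbb{C}^1. \tag{5} \]

One can state right away a solution to this problem. It is given by

\[ u(z) = \frac{1}{\pi} \int_{\mathbb{C}^1} \frac{f(\zeta)}{z - \zeta} \, dm(\zeta) = \frac{1}{\pi} \int_{\mathbb{C}^1} \frac{f(z - \zeta)}{\zeta} \, dm(\zeta) \tag{6} \]

with $dm(\zeta)$ the Lebesgue measure in $\mathbb{C}^1$. Alternatively, we can write $u =
f * \Phi$, with $\Phi(z) = 1/(\pi z)$. The precise statement regarding (5) and (6)
is the following assertion.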

Proposition 3.1 Suppose $f$ is continuous and has compact support on $\mathbb{C}$.
Then:

(a) $u$ given by (6) is also continuous and satisfies (5) in the sense of
distributions.

(b) If $f$ is in the class $C^k$, $k \ge 1$, then so is $u$, and $u$ satisfies (5) in
the usual sense.

(c) If $u$ is any $C^1$ function of compact support, then $u$ is already of the
form (6); in fact

\[ u = \frac{\partial u}{\partial \bar z} * \Phi. \]
Proof. Note first that

\[ u(z + h) - u(z) = \frac{1}{\pi} \int_{\mathbb{C}^1} \big(f(z + h - \zeta) - f(z - \zeta)\big) \frac{dm(\zeta)}{\zeta}, \]

and that this tends to zero as $h \to 0$, by the uniform continuity of $f$
and the fact that the function $1/\zeta$ is integrable over compact sets in $\mathbb{C}^1$.
If $f$ is in the class $C^k$, $k \ge 1$, an easy elaboration of this shows that we
can differentiate under the integral sign in (6) and find that any partial
derivative of $u$ of order $\le k$ is represented in the same way in terms of
partial derivatives of $f$.

Next we use the fact that $\Phi(z) = \frac{1}{\pi z}$ is a fundamental solution
of the operator $\partial/\partial \bar z$. This means that in the sense of distributions
$\frac{\partial}{\partial \bar z} \Phi = \delta_0$, with $\delta_0$ the Dirac delta function at the origin. (See Exercise 16
in Chapter 3.) So using the formalism of distributions, as in Chapter 3,
we have

\[ \frac{\partial}{\partial \bar z}(f * \Phi) = f * \Big(\frac{\partial \Phi}{\partial \bar z}\Big) = \Big(\frac{\partial f}{\partial \bar z}\Big) * \Phi. \]

The first set of equalities means that $\partial u/\partial \bar z = f$, since $f * \delta_0 = f$, and so
assertions (a) and (b) are now proved. Using the equality of the second
and third members above (with $u$ in place of $f$) gives
$u = u * \frac{\partial \Phi}{\partial \bar z} = \frac{\partial u}{\partial \bar z} * \Phi$,
and this is assertion (c).


When we turn to the inhomogeneous Cauchy-Riemann equations (4)
for $n \ge 2$, there is an immediate difference that is obvious: the $f_j$’s cannot
be given “arbitrarily” but must satisfy a necessary consistency condition

\[ \frac{\partial f_j}{\partial \bar z_k} = \frac{\partial f_k}{\partial \bar z_j} \quad \text{for all } 1 \le j, k \le n. \tag{7} \]

Moreover, it turns out that now the assumption that the $f_j$ have compact
support implies the existence of a solution of compact support. The result
is contained in the following proposition.

Proposition 3.2 Suppose $n \ge 2$. If $f_j$, $1 \le j \le n$, are functions of class
$C^k$ of compact support that satisfy (7), then there exists a function $u$
of class $C^k$ and of compact support that satisfies the inhomogeneous
Cauchy-Riemann equations (4).²

Proof. Write $z = (z', z_n)$, where $z' = (z_1, \dots, z_{n-1}) \in \mathbb{C}^{n-1}$ and set

\[ u(z) = \frac{1}{\pi} \int_{\mathbb{C}^1} f_n(z', z_n - \zeta) \, \frac{dm(\zeta)}{\zeta}. \tag{8} \]

Then by the previous proposition $\partial u/\partial \bar z_n = f_n$. However by
differentiating under the integral sign (which is easily justified) we see that for
$1 \le j \le n - 1$

\[
\begin{aligned}
\frac{\partial u}{\partial \bar z_j}
  &= \frac{1}{\pi} \int_{\mathbb{C}^1} \frac{\partial f_n}{\partial \bar z_j}(z', z_n - \zeta) \, \frac{dm(\zeta)}{\zeta} \\
  &= \frac{1}{\pi} \int_{\mathbb{C}^1} \frac{\partial f_j}{\partial \bar z_n}(z', z_n - \zeta) \, \frac{dm(\zeta)}{\zeta} \\
  &= f_j(z', z_n).
\end{aligned}
\]

The next-to-last step results from the consistency condition (7), and the
last step is a consequence of part (c) of Proposition 3.1. Therefore $u$
solves (4).

Next, since the $f_j$ have compact support, there is a fixed $R$, so that
the $f_j$ vanish when $|z| > R$ for all $j$. Thus by Proposition 1.1, $u$ is
holomorphic in $|z| > R$. Moreover, if $|z'| > R$ then $f_n(z', z_n - \zeta) = 0$ for
all $\zeta$, so by (8), $u$ also vanishes there. Since the latter
is an open subset of the connected set $|z| > R$, Proposition 1.2 implies
that $u$ vanishes when $|z| > R$, and all our assertions are proved.

² In the case $k = 0$, the identities (7) and (4) are taken in the sense of distributions.

A few remarks may help clarify the nature of the solutions provided
by the previous propositions.

• As opposed to the higher-dimensional case, when $n = 1$ it is not
possible in general to solve (4) with a function $u$ of compact
support, given $f$ of compact support. In fact it is easily seen that
a necessary condition for the existence of such a solution is that
$\int_{\mathbb{C}^1} f(z) \, dm(z) = 0$. The full necessary and sufficient conditions are
described in Exercise 7.

• When $n \ge 2$, the solution given by (8) is the unique solution which
has compact support. This is evident because the difference of
two solutions is a holomorphic function on all of $\mathbb{C}^n$. Similarly,
when $n = 1$, the solution $u$ given by (6) is the unique one for which
$u(z) \to 0$, as $|z| \to \infty$.
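
The necessary condition in the first remark can be seen in two ways. The sketch below assumes, for the first computation, that the compactly supported solution $u$ is of class $C^1$; the second computation uses only formula (6) and the compact support of $f$.

```latex
% (1) If u is C^1 with compact support and \partial u / \partial \bar z = f, then
\[
  \int_{\mathbb{C}^1} f \, dm
    = \frac12 \int_{\mathbb{R}^2}
      \Big(\frac{\partial u}{\partial x} + i \frac{\partial u}{\partial y}\Big)\, dx\, dy
    = 0,
\]
% since the integral of a derivative of a compactly supported function vanishes.
%
% (2) For u given by (6), with f supported in |\zeta| \le R, the factor
%     z/(z - \zeta) tends to 1 uniformly on the support as |z| \to \infty, so
\[
  z\, u(z) = \frac{1}{\pi} \int_{\mathbb{C}^1} f(\zeta)\, \frac{z}{z - \zeta}\, dm(\zeta)
    \;\longrightarrow\; \frac{1}{\pi} \int_{\mathbb{C}^1} f(\zeta)\, dm(\zeta).
\]
% Thus u(z) decays only like a multiple of 1/z at infinity unless the
% integral of f is zero.
```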

The simple facts that we have proved about solutions of the
inhomogeneous Cauchy-Riemann equations in the whole space $\mathbb{C}^n$ allow us to
obtain a general form of Hartogs’ principle illustrated by Theorem 2.1.
This can be formulated as follows.


Theorem 3.3 Suppose $\Omega$ is a bounded region in $\mathbb{C}^n$, $n \ge 2$, and $K$ is a
compact subset of $\Omega$ such that $\Omega - K$ is connected. Then any function $F_0$
analytic in $\Omega - K$ has an analytic continuation into $\Omega$.

This means that there is an analytic function $F$ on $\Omega$, so that $F = F_0$
on $\Omega - K$.


To prove the theorem observe first that there exists $\epsilon > 0$, so that
the open set $\mathcal{O}_\epsilon = \{z : d(z, \Omega^c) < \epsilon\}$ is at a positive distance from $K$.
Note that then $(\Omega \cap \mathcal{O}_\epsilon) \subset (\Omega - K)$. Next we can construct a $C^\infty$ cut-off
function³ $\eta$ so that $\eta(z) = 0$ for $z$ in a neighborhood of $K$, while $\eta(z) = 1$
for $z \in \mathcal{O}_\epsilon$. With this function we define $F_1$ in $\Omega$ by

\[ F_1(z) = \begin{cases} \eta(z) F_0(z) & \text{for } z \in \Omega - K, \\ 0 & \text{for } z \in K. \end{cases} \]

The function $F_1$ is $C^\infty$ in $\Omega$. While $F_1$ gives an extension to $\Omega$ of $F_0$,
this extension is of course not analytic. But by how much does it fail to
have this property? To answer this, we define $f_j$ by

\[ f_j = \frac{\partial F_1}{\partial \bar z_j}, \quad \text{for } j = 1, \dots, n. \tag{9} \]

³ Note that $C^2$ instead of $C^\infty$ would do for the rest of this proof.
Note that the $f_j$ are $C^\infty$ functions in $\Omega$, and automatically satisfy
the consistency conditions (7) there. Moreover the $f_j$ vanish near the
boundary of $\Omega$ (in particular for $z \in \Omega \cap \mathcal{O}_\epsilon$) because of the analyticity
of $F_0$. Thus the $f_j$ can be extended to be zero outside $\Omega$ so that now
the extended $f_j$ are $C^\infty$ and satisfy (7) in the whole of $\mathbb{C}^n$. We call the
extended $f_j$ by the same name. We now correct the error given by (9)
using Proposition 3.2 to find a function $u$ of compact support so that
$\partial u/\partial \bar z_j = f_j$ for all $j$, and take $F = F_1 - u$.
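
The properties of the $f_j$ asserted here follow from short computations. The sketch below spells them out, using only the definitions of $\eta$, $F_1$ and (9) given in the proof.

```latex
% On \Omega - K, F_0 is holomorphic, so only the cut-off is differentiated:
\[
  f_j = \frac{\partial (\eta F_0)}{\partial \bar z_j}
      = F_0\, \frac{\partial \eta}{\partial \bar z_j} ,
\]
% which vanishes wherever \eta is locally constant, namely on
% \Omega \cap \mathcal{O}_\epsilon (where \eta = 1) and near K (where \eta = 0).
%
% Consistency condition (7): mixed partial derivatives of F_1 commute,
\[
  \frac{\partial f_j}{\partial \bar z_k}
    = \frac{\partial^2 F_1}{\partial \bar z_k\, \partial \bar z_j}
    = \frac{\partial f_k}{\partial \bar z_j} .
\]
% Holomorphy of F = F_1 - u in \Omega:
\[
  \frac{\partial F}{\partial \bar z_j}
    = f_j - \frac{\partial u}{\partial \bar z_j} = 0,
  \qquad 1 \le j \le n .
\]
```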

Note that $F$ is holomorphic in $\Omega$ (since $\partial F/\partial \bar z_j = 0$, $1 \le j \le n$, there).
We will next see that $F$ agrees with $F_0$ in an appropriate open subset of
$\Omega - K$, which is the same as saying that $u$ vanishes in that open set.

To describe the open set in question we find the smallest $R$ so that
$\overline{\Omega} \subset \{|z| \le R\}$. Then clearly there is a $z^0 \in \partial\Omega$ with $|z^0| = R$. We set
$B_\epsilon = B_\epsilon(z^0) = \{z : |z - z^0| < \epsilon\}$, and will see that $\Omega \cap B_\epsilon$ is an open set
in $\Omega - K$ where $u$ vanishes. (See Figure 4.)



Figure 4. The function $u$ vanishes in $\Omega \cap B_\epsilon$


The fact that $\Omega \cap B_\epsilon$ is an open non-empty set in $\Omega - K$ is
immediate since $B_\epsilon \subset \mathcal{O}_\epsilon$ and hence $B_\epsilon$ is disjoint from $K$; also if $\Omega \cap B_\epsilon$ were
empty, $z^0$ could not be a boundary point of $\Omega$. In addition, $u$ is
holomorphic in $B_\epsilon$ (more generally in $\mathcal{O}_\epsilon$), since the $f_j$ vanish there. Moreover,
$u$ is zero in $\{|z| > R\}$ since $u$ is analytic there, this set is connected,
and $u$ vanishes outside a compact set. Finally, $B_\epsilon \cap \{|z| > R\}$ is clearly
a non-empty open set of $B_\epsilon$. Therefore $u$ vanishes throughout $B_\epsilon$ and
in particular in $\Omega \cap B_\epsilon$. This shows that $F$ and $F_0$ agree on an open set
of $\Omega - K$, and since the latter set is connected, they agree throughout
$\Omega - K$. The theorem is therefore proved.


4 A boundary version: the tangential Cauchy-Riemann 
equations 

We have just seen that if a holomorphic function $F_0$ is given in a
(connected) neighborhood of the boundary of a region $\Omega$ in $\mathbb{C}^n$, $n \ge 2$, then it
extends to the whole region. Since the neighborhood on which $F_0$ is given
can in principle be arbitrarily narrow, it is natural to ask what happens
in the limiting situation where $F_0$ is given only on the boundary $\partial\Omega$ of $\Omega$.
To answer this we must answer the question: what functions $F_0$ given
only on $\partial\Omega$ extend to holomorphic functions in all of $\Omega$?

We shall formulate this problem precisely and solve it in the context of
regions with sufficiently smooth boundaries. We begin by reviewing the
relevant definitions and elementary background facts that are needed for
this.

We start in the setting of $\mathbb{R}^d$ and later pass to $\mathbb{C}^n$ by identifying the
latter space with the former when $d = 2n$. Now suppose we are given a
region $\Omega$ in $\mathbb{R}^d$. A defining function $\rho$ of $\Omega$ is a real-valued function
on $\mathbb{R}^d$ so that

\[ \begin{cases} \rho(x) < 0, & \text{when } x \in \Omega, \\ \rho(x) = 0, & \text{when } x \in \partial\Omega, \\ \rho(x) > 0, & \text{when } x \in \overline{\Omega}^c. \end{cases} \]

For any integer $k \ge 1$ the boundary of $\Omega$ is said to be of class $C^k$ if $\Omega$
has a defining function $\rho$ which satisfies

• $\rho \in C^k(\mathbb{R}^d)$;

• $|\nabla \rho(x)| > 0$, whenever $x \in \partial\Omega$.

The boundary $\partial\Omega$ is an example of a hypersurface of class $C^k$. More
generally we shall say that $M$ is a (local) hypersurface of class $C^k$ if
there is a real-valued $C^k$ function $\rho$, defined on a ball $B \subset \mathbb{R}^d$ so that
$M = \{x \in B : \rho(x) = 0\}$, and $|\nabla \rho(x)| > 0$ whenever $x \in M$.

For a region $\Omega$ whose boundary is of class $C^k$ one knows that near
any boundary point $\partial\Omega$ can be realized as a “graph.” More precisely,
fixing any point of reference $x^0 \in \partial\Omega$ and making an appropriate affine-
linear change of coordinates (in fact a translation and rotation of $\mathbb{R}^d$)
then, by the implicit function theorem, we can achieve the following:
With the new coordinate system written as $x = (x', x_d)$ where $x' \in \mathbb{R}^{d-1}$
and $x_d \in \mathbb{R}$, the initial reference point $x^0$ corresponds to $(0, 0)$ and near
$x^0 = (0, 0)$ the region $\Omega$ and its boundary are given by

\[ \begin{aligned} \Omega &: \; x_d > \varphi(x'), \\ \partial\Omega &: \; x_d = \varphi(x'). \end{aligned} \tag{10} \]

Here $\varphi$ is a $C^k$ function defined near the origin in $\mathbb{R}^{d-1}$. We can also
arrange matters so that (in addition to $\varphi(0) = 0$), one has $\nabla_{x'}(\varphi)(x')|_{x' = 0} =
0$, which means that the tangent plane to $\partial\Omega$ at the origin is the
hyperplane $x_d = 0$. (See Figure 5.)

In this coordinate system, because $\rho(x', \varphi(x')) = 0$, we have

\[
\begin{aligned}
\rho(x) &= \rho(x', x_d) - \rho(x', \varphi(x')) \\
        &= \int_0^1 \frac{\partial}{\partial t} \rho\big(x', t x_d + (1 - t)\varphi(x')\big) \, dt \\
        &= (\varphi(x') - x_d)\, a(x),
\end{aligned}
\]

with $a(x) = -\int_0^1 \frac{\partial \rho}{\partial x_d}\big(x', t x_d + (1 - t)\varphi(x')\big) \, dt$. In other words, $\rho(x) =
a(x)(\varphi(x') - x_d)$, where $a$ is a $C^{k-1}$ function. Also $a(x) > 0$ if $x$ is
sufficiently close to the reference point $x^0$, since then $\frac{\partial \rho}{\partial x_d} < 0$, in view of the
fact that $\partial/\partial x_d$ points “inwards” with respect to $\Omega$.

Now suppose $\tilde\rho$ is another $C^k$ defining function for $\Omega$. Then near $x^0$
we again have $\tilde\rho(x) = \tilde a(x)(\varphi(x') - x_d)$ and thus

\[ \tilde\rho = c\rho, \quad \text{where } c(x) > 0 \tag{11} \]

and $c$ is of class $C^{k-1}$.
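
A simple instance of these notions is the unit ball in $\mathbb{R}^d$. The example below is supplied for illustration and is not taken from the text; the second defining function is considered only near the unit sphere, since it is not smooth at the origin.

```latex
% Defining function for the unit ball \Omega = \{|x| < 1\}:
\[
  \rho(x) = |x|^2 - 1, \qquad \nabla \rho(x) = 2x, \qquad
  |\nabla \rho(x)| = 2 > 0 \ \text{ when } |x| = 1 .
\]
% A second defining function near the sphere, and the factor c of (11):
\[
  \tilde\rho(x) = |x| - 1 = \frac{|x|^2 - 1}{|x| + 1} = c(x)\,\rho(x),
  \qquad c(x) = \frac{1}{|x| + 1} > 0 .
\]
```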

Next we recall that a vector field $X$ on $\mathbb{R}^d$ can be viewed as a first-order
linear differential operator of the form

\[ X(f) = \sum_{j=1}^d a_j(x) \frac{\partial f}{\partial x_j}, \]

with $(a_1(x), a_2(x), \dots, a_d(x))$ the “vector” corresponding to the point
$x \in \mathbb{R}^d$. This vector field is tangential at $\partial\Omega$ if

\[ X(\rho) = \sum_{j=1}^d a_j(x) \frac{\partial \rho}{\partial x_j} = 0, \quad \text{whenever } x \in \partial\Omega. \]

Because of (11) and Leibnitz’s rule, this definition does not depend on
the choice of the defining function of $\Omega$.

Next we fix an $\ell$ with $\ell \le k$. Then, any function $f_0$ defined on $\partial\Omega$ is
said to be of class $C^\ell$ if there is an extension $f$ of $f_0$ to $\mathbb{R}^d$ so that $f$ is
of class $C^\ell$ on $\mathbb{R}^d$. Now if $X$ is a tangential vector field and $f$ and $f'$ are
any two extensions of $f_0$, then as is easily seen $X(f)|_{\partial\Omega} = X(f')|_{\partial\Omega}$. (See
Exercise 8.) So in this sense we may speak of the action of a tangential
vector field on functions defined only on $\partial\Omega$.

We now pass to the complex space $\mathbb{C}^n$ that we identify with $\mathbb{R}^d$, $d = 2n$.
We do this by writing $z \in \mathbb{C}^n$, $z = (z_1, \dots, z_n)$, $z_j = x_j + i y_j$, $1 \le j \le n$,
and then setting $x = (x_1, \dots, x_{2n}) \in \mathbb{R}^{2n}$ with $x_j$, $1 \le j \le n$, as before,
and $x_{j+n} = y_j$, for $1 \le j \le n$. Vector fields on $\mathbb{C}^n$ can now be written as

\[ X = \sum_{j=1}^n \Big(a_j(z) \frac{\partial}{\partial \bar z_j} + b_j(z) \frac{\partial}{\partial z_j}\Big). \]

(Here it is necessary to allow the coefficients to be complex-valued.) Such
a vector field is called a Cauchy-Riemann vector field, if $b_j = 0$ for
all $j$, that is, $X$ is of the form

\[ X = \sum_{j=1}^n a_j(z) \frac{\partial}{\partial \bar z_j}. \]

Equivalently, $X$ is a Cauchy-Riemann vector field if it annihilates all
holomorphic functions.
Given a region $\Omega$ (with $C^k$ boundary) then the above Cauchy-Riemann
vector field $X$ is tangential if

\[ \sum_{j=1}^n a_j(z) \rho_j(z) = 0, \quad \text{where } \rho_j(z) = \frac{\partial \rho}{\partial \bar z_j}. \]

Now near any fixed $z^0 \in \partial\Omega$ at least one of the $\rho_j(z^0)$, $1 \le j \le n$, must
be non-zero, since $|\nabla \rho(z^0)| > 0$; for simplicity we may assume $j = n$.
Then the $n - 1$ vector fields

\[ \rho_n \frac{\partial}{\partial \bar z_j} - \rho_j \frac{\partial}{\partial \bar z_n}, \quad 1 \le j \le n - 1 \tag{12} \]

are linearly independent and span the tangential Cauchy-Riemann vector
fields near $z^0$ (up to multiplication by functions).

Without making the particular choice $j = n$ one notes that the $n(n -
1)/2$ vector fields

\[ \rho_k \frac{\partial}{\partial \bar z_j} - \rho_j \frac{\partial}{\partial \bar z_k}, \quad 1 \le j < k \le n \tag{13} \]

span the tangential Cauchy-Riemann vector fields (globally), but of course
are not linearly independent.
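
For a concrete case, consider the unit ball in $\mathbb{C}^2$. The computation below is an illustrative sketch; the ball and the letter $L$ for the resulting vector field are choices made here and do not appear in the text.

```latex
% Defining function and its \bar z-derivatives:
\[
  \rho(z) = |z_1|^2 + |z_2|^2 - 1, \qquad
  \rho_j = \frac{\partial \rho}{\partial \bar z_j} = z_j \quad (j = 1, 2).
\]
% The single vector field given by (12) with n = 2, j = 1:
\[
  L = z_2\, \frac{\partial}{\partial \bar z_1} - z_1\, \frac{\partial}{\partial \bar z_2},
  \qquad
  L(\rho) = z_2 z_1 - z_1 z_2 = 0 .
\]
% L annihilates every holomorphic function, but not, for instance, \bar z_1:
\[
  L(\bar z_1) = z_2 .
\]
```

Here $L(\rho)$ vanishes identically, not only on the sphere, so $L$ is tangential to every sphere centered at the origin.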

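There is a way of expressing this neatly by using the language of
differential forms. Suppose $u$ is a complex-valued function. Then we can
abbreviate the equations $\frac{\partial u}{\partial \bar z_j} = f_j$, for $1 \le j \le n$, by

\[ \bar\partial u = f, \]

with $\bar\partial u$ and $f$ the “one-forms”⁴ defined by $\sum_{j=1}^n \frac{\partial u}{\partial \bar z_j} d\bar z_j$ and $\sum_{j=1}^n f_j \, d\bar z_j$,
respectively. Now for any one-form $w = \sum_{j=1}^n w_j \, d\bar z_j$, we define the
two-form $\bar\partial w$ by

\[
\begin{aligned}
\bar\partial w &= \sum_{j=1}^n \bar\partial w_j \wedge d\bar z_j \\
  &= \sum_{1 \le k, j \le n} \frac{\partial w_j}{\partial \bar z_k} \, d\bar z_k \wedge d\bar z_j \\
  &= \sum_{1 \le k < j \le n} \Big(\frac{\partial w_j}{\partial \bar z_k} - \frac{\partial w_k}{\partial \bar z_j}\Big) \, d\bar z_k \wedge d\bar z_j,
\end{aligned}
\]

since $d\bar z_k \wedge d\bar z_j = -d\bar z_j \wedge d\bar z_k$ in this formalism.

⁴ More precisely, (0, 1)-forms.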

With this notation the inhomogeneous Cauchy-Riemann equations (4)
can be written as $\bar\partial u = f$, and the consistency condition (7) is the same as
$\bar\partial f = 0$. Moreover a function $F_0$ is annihilated by the tangential Cauchy-
Riemann vector fields ((12) or (13)) exactly when

\[ \bar\partial F_0 \wedge \bar\partial \rho \,\big|_{\partial\Omega} = 0. \tag{14} \]

So whenever $F_0$ is the restriction to $\partial\Omega$ of a function of class $C^1(\overline{\Omega})$ that
is holomorphic in $\Omega$, it must satisfy these tangential Cauchy-Riemann
equations. The remarkable fact is that, broadly speaking, the converse
of this holds. This is the thrust of Bochner’s theorem.
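
The equivalence between (14) and annihilation by the tangential Cauchy-Riemann vector fields is easiest to see when $n = 2$. The following expansion is a sketch of that case, writing $\rho_k = \partial\rho/\partial \bar z_k$ as in the text.

```latex
% Expand both one-forms and use d\bar z_j \wedge d\bar z_j = 0 and
% d\bar z_2 \wedge d\bar z_1 = - d\bar z_1 \wedge d\bar z_2:
\[
  \bar\partial F_0 \wedge \bar\partial \rho
    = \Big(\sum_{j=1}^{2} \frac{\partial F_0}{\partial \bar z_j}\, d\bar z_j\Big)
      \wedge
      \Big(\sum_{k=1}^{2} \rho_k\, d\bar z_k\Big)
    = \Big(\rho_2\, \frac{\partial F_0}{\partial \bar z_1}
          - \rho_1\, \frac{\partial F_0}{\partial \bar z_2}\Big)\,
      d\bar z_1 \wedge d\bar z_2 .
\]
% The coefficient is exactly the vector field (12) (with n = 2, j = 1)
% applied to F_0, so (14) holds on \partial\Omega precisely when that
% tangential field annihilates F_0 there.
```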


Theorem 4.1 Assume $\Omega$ is a bounded region in $\mathbb{C}^n$, whose boundary is
of class $C^3$, and suppose the complement of $\overline{\Omega}$ is connected. If $F_0$ is a
function of class $C^3$ on $\partial\Omega$ that satisfies the tangential Cauchy-Riemann
equations, then there is a holomorphic function $F$ in $\Omega$ that is continuous
in $\overline{\Omega}$, so that $F|_{\partial\Omega} = F_0$.

The fact that some connectedness property is required for both this and
the previous theorem can be seen in Exercise 10.

The proof of this theorem is in the same spirit as the previous one, but the details are different. The function $F_0$ of class $C^3(\partial\Omega)$ can, by definition, be thought of as a function of class $C^3$ on the whole space. Now $F_0$ satisfies the tangential Cauchy-Riemann equations, and we can modify it (without changing its restriction to $\partial\Omega$) so that the modified function $F_1$ is of class $C^2$ and

$$\bar\partial F_1\big|_{\partial\Omega} = 0. \tag{15}$$

This modification is achieved by taking $F_1 = F_0 - a\rho$, where $a$ is a suitable $C^2$ function. Indeed, $F_1$ already satisfies the tangential Cauchy-Riemann equations. An independent Cauchy-Riemann vector field (that is not tangential) is given by $N$, with

$$N(f) = \sum_{j=1}^n \bar\rho_j\,\frac{\partial f}{\partial\bar z_j}.$$

In fact, we note that

$$N(\rho) = \sum_{j=1}^n \left|\frac{\partial\rho}{\partial\bar z_j}\right|^2 = \frac{1}{4}|\nabla\rho|^2 > 0.$$

Thus if we set $a = N(F_0)/N(\rho)$ near the boundary of $\Omega$, and extend $a$ strictly away from the boundary to be zero, then (15) is achieved because of (14).
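
The choice of $a$ can be checked directly. The following is a sketch under the assumption that $\bar\partial\rho \neq 0$ on $\partial\Omega$ (which holds since $|\nabla\rho| > 0$ there); the scalar function $b$ is an auxiliary symbol introduced only for this computation.

```latex
\begin{align*}
% (14) says that \bar\partial F_0 is proportional to \bar\partial\rho on the boundary:
\frac{\partial F_0}{\partial\bar z_j} &= b\,\rho_j \quad (1\le j\le n)
  \qquad\text{on } \partial\Omega,\\
% contracting with \bar\rho_j identifies b with a:
N(F_0) &= \sum_{j=1}^n \bar\rho_j\, b\,\rho_j = b\,N(\rho)
  \;\Longrightarrow\; b = \frac{N(F_0)}{N(\rho)} = a,\\
% since \rho = 0 on the boundary, the term \rho\,\bar\partial a drops out:
\bar\partial F_1 &= \bar\partial F_0 - a\,\bar\partial\rho - \rho\,\bar\partial a
  = (b-a)\,\bar\partial\rho = 0 \qquad\text{on } \partial\Omega.
\end{align*}
```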

We now define the one-form $f$ in $\overline{\Omega}$ by $f = \bar\partial F_1$. Then $f$ is continuous on $\overline{\Omega}$, is of class $C^1$, vanishes on $\partial\Omega$, and satisfies $\bar\partial f = 0$ in the interior of $\overline{\Omega}$. We can now extend $f$ to $\mathbb{C}^n$ (keeping the same name) so that $f = 0$ outside of $\Omega$. Then $f$ satisfies $\bar\partial f = 0$ in $\mathbb{C}^n$ (at least in the sense of distributions). This would be evident if we supposed that $F_0$ and $\partial\Omega$ were of class $C^4$ instead of class $C^3$. In the latter case an additional argument is needed (see Exercise 6 in Chapter 3). We can now invoke Proposition 3.2 to obtain a continuous function $u$ so that $\bar\partial u = f$ and moreover $u$ has compact support. Since $u$ is holomorphic on the complement of $\overline{\Omega}$ and this set is connected, it follows that $u$ vanishes throughout the complement of $\overline{\Omega}$ and by continuity it vanishes on $\partial\Omega$. Finally, take $F = F_1 - u$; then $F$ is holomorphic in $\Omega$, continuous in $\overline{\Omega}$, and $F|_{\partial\Omega} = F_1|_{\partial\Omega} = F_0|_{\partial\Omega}$, completing the proof of the theorem.


In the case $n = 1$, there are no tangential Cauchy-Riemann equations and the conditions on $F_0$ are global in nature. See Exercise 12.

By a different argument one can reduce the degree of regularity involved on $F_0$. See Problem 3*.

Given the nature of the conditions that are sufficient when $n > 1$, it is natural to ask if there is in fact a “local” version of the extension theorem just proved. For this to be possible, the formulation of such a result must distinguish on which “side” of the boundary this continuation holds. The example of the “inside” of the sphere, where continuation takes place as opposed to the “outside” where it fails, suggests that a convexity property might be involved. This is indeed the case because of the complex structure of $\mathbb{C}^n$, as we will see when we examine the local nature of the boundary of a region.

5 The Levi form 

Let us briefly glance back to the situation in $\mathbb{R}^d$. We will see that near any boundary point $x^0$ the region $\Omega$ can be put in a very simple canonical form. We already noted earlier that near $x^0$, in the appropriate coordinates, we can represent $\Omega$ as $\{x_d > \varphi(x')\}$. Now if we introduce new coordinates $(\bar x_1, \bar x_2, \dots, \bar x_d)$, by $\bar x_d = x_d - \varphi(x')$, $\bar x_j = x_j$, $1 \le j < d$ (with inverse $x_d = \bar x_d + \varphi(\bar x')$, $x_j = \bar x_j$, $1 \le j < d$) we obtain that locally $\Omega$ is now represented by the half-space $\bar x_d > 0$, and $\partial\Omega$ by the hyperplane $\bar x_d = 0$.

However to be applicable to the study of holomorphic functions in $\mathbb{C}^n$, the new coordinates that we can allow (that is, the change of variables that is permissible) must be given by holomorphic functions, so our choices are more restricted. The coordinates that result from such changes of variables (starting with the standard coordinates about a fixed point $z^0$) will be called holomorphic coordinates. Here we assume that $\partial\Omega$ is of class $C^2$, and use the notation $z_j = x_j + iy_j$.

Proposition 5.1 Near any point $z^0 \in \partial\Omega$ we can introduce holomorphic coordinates $(z_1, \dots, z_n)$ centered at $z^0$ so that

$$\Omega = \Big\{\mathrm{Im}(z_n) > \sum_{j=1}^{n-1}\lambda_j|z_j|^2 + E(z)\Big\}. \tag{16}$$

Here the $\lambda_j$ are real numbers, and $E(z) = x_n\ell(z') + Dx_n^2 + o(|z|^2)$, as $z \to 0$;⁵ also $\ell(z')$ is a linear function of $x_1, \dots, x_{n-1}, y_1, \dots, y_{n-1}$, and $D$ is a real number.

A few remarks may help to clarify the nature of the canonical representation (16).

• By making a further change of scale $z_j \to \delta_j z_j$, $\delta_j \neq 0$, we can set the $\lambda_j$ to be either $1$, $-1$ or $0$.

• The number of $\lambda_j$ that are positive, negative or zero (the signature of the quadratic form) is a holomorphic invariant as we will see below.

• It can be seen from (16) that it is natural to assign the variables $z_1, \dots, z_{n-1}$ “weight 1” and the variable $z_n$ “weight 2,” which, disregarding the error term, makes the expression homogeneous of weight 2. This homogeneous version of (16) gives us the “half-space” $\mathcal{U}$ that we consider further in the Appendix to this chapter.

• If we had assumed that $\partial\Omega$ was of class $C^3$, then the error estimate $o(|z|^2)$ would be improved to $O(|z|^3)$, as $z \to 0$.

⁵ $f(z) = o(|z|^2)$ as $z \to 0$ means that $|f(z)|/|z|^2 \to 0$, as $|z| \to 0$.

Proof of the proposition. As in (10), we see that we can introduce complex coordinates (with an affine complex linear change of variables) so that near $z^0$ the set $\Omega$ is given by

$$\mathrm{Im}(z_n) > \varphi(z', x_n)$$

with $z = (z', z_n)$, $z' = (z_1, \dots, z_{n-1})$, and $z_j = x_j + iy_j$. We can also arrange matters so that $\varphi(0,0) = 0$ and

$$\frac{\partial\varphi}{\partial x_j}\Big|_{(0,0)} = \frac{\partial\varphi}{\partial y_j}\Big|_{(0,0)} = \frac{\partial\varphi}{\partial x_n}\Big|_{(0,0)} = 0, \qquad 1 \le j \le n-1.$$

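Using Taylor’s expansion of $\varphi$ at the origin up to order 2 we see that

$$\varphi = \sum_{1\le j,k\le n-1}\left(\alpha_{jk}z_jz_k + \bar\alpha_{jk}\bar z_j\bar z_k\right) + \sum_{1\le j,k\le n-1}\beta_{jk}z_j\bar z_k + x_n\ell(z') + Dx_n^2 + o(|z|^2), \quad\text{as } z \to 0.$$

Here $\beta_{jk} = \bar\beta_{kj}$ and $\ell$ is a (real) linear function of the variables $x_1, \dots, x_{n-1}$ and $y_1, \dots, y_{n-1}$, with $D$ a real number.

Next we introduce the (global) holomorphic change of coordinates $\zeta_n = z_n - 2i\sum_{1\le j,k\le n-1}\alpha_{jk}z_jz_k$ and $\zeta_k = z_k$, for $1 \le k \le n-1$. Then $\mathrm{Im}(\zeta_n) = \mathrm{Im}(z_n) - \sum_{1\le j,k\le n-1}(\alpha_{jk}z_jz_k + \bar\alpha_{jk}\bar z_j\bar z_k)$, and thus in these new coordinates (where we immediately relabel the $\zeta$'s as $z$'s) the function $\varphi$ becomes $\sum_{1\le j,k\le n-1}\beta_{jk}z_j\bar z_k + x_n\ell(z') + Dx_n^2 + o(|z|^2)$.

Next, a unitary mapping (in the $z_1, \dots, z_{n-1}$ variables) allows us to diagonalize the Hermitian form and $\varphi$ becomes

$$\sum_{j=1}^{n-1}\lambda_j|z_j|^2 + x_n\ell(z') + Dx_n^2 + o(|z|^2) \tag{17}$$

with $\lambda_1, \dots, \lambda_{n-1}$ the eigenvalues of the quadratic form. This proves the proposition.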


The Hermitian matrix $\left(\frac{\partial^2\varphi}{\partial z_j\,\partial\bar z_k}\right)_{1\le j,k\le n-1}$ that appears implicitly above, or its diagonalized version, the form $\sum_{j=1}^{n-1}\lambda_j|z_j|^2$ in (16), is referred to as the Levi form of $\Omega$ (at the boundary point $z^0$). A more intrinsic definition comes about by noticing that the vectors $\partial/\partial\bar z_j$, $1 \le j \le n-1$, are tangent to $\partial\Omega$ at $z^0$. If $\rho(z) = \varphi(z', x_n) - y_n$, then the corresponding quadratic form is

$$\sum_{1\le j,k\le n}\frac{\partial^2\rho}{\partial z_j\,\partial\bar z_k}\,\bar a_j a_k, \tag{18}$$

restricted to the vectors $\sum_{k=1}^n a_k\,\partial/\partial\bar z_k$ that are tangential at $z^0$. Note also that these tangent vectors form a complex subspace (of complex dimension $n-1$) of the full tangent space (which has real dimension $2n-1$).

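Now let $\rho'$ be another defining function for $\Omega$. Then $\rho' = c\rho$ with $c > 0$, and we assume $c$ is of class $C^2$. Then by Leibniz’s rule,

$$\sum_{1\le j,k\le n}\frac{\partial^2\rho'}{\partial z_j\,\partial\bar z_k}\,\bar a_j a_k = c\sum_{1\le j,k\le n}\frac{\partial^2\rho}{\partial z_j\,\partial\bar z_k}\,\bar a_j a_k$$

since $\rho = 0$ there, and also $\sum_{k=1}^n a_k\frac{\partial\rho}{\partial\bar z_k} = 0$ because $\sum_{k=1}^n a_k\,\partial/\partial\bar z_k$ is tangential. Thus the signature of the form (18) is independent of the choice of defining function.

Finally let $z \mapsto \Phi(z) = w$ be a biholomorphic mapping defined near the origin (with $\Phi(0) = 0$), giving us a new holomorphic coordinate system $(w_1, \dots, w_n)$ in the neighborhood of $z^0$. Then by holomorphicity the differential of $\Phi$ maps tangent vectors at $z^0$ of the form $\sum_{k=1}^n a_k\,\partial/\partial\bar z_k$ to tangent vectors of the form $\sum_{k=1}^n a'_k\,\partial/\partial\bar w_k$. Now if $\rho'$ is a defining function of $\Phi(\Omega)$, then $\rho'(\Phi(z)) = \rho''(z)$ is another defining function of $\Omega$ near $z^0$ and we can conclude by the above that the signature of (18) is invariant under holomorphic bijections.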

With regard to the above, one says that a boundary point $z^0 \in \partial\Omega$ is pseudo-convex if the Levi form is non-negative, and strongly pseudo-convex if that form is strictly positive definite. A region $\Omega$ is pseudo-convex if every boundary point of $\Omega$ has this property.

A good illustration is given by the unit ball $\{|z| < 1\}$. If we take $\rho(z) = |z|^2 - 1$ to be its defining function, we see that at every boundary point the Levi form corresponds to the identity matrix, and hence the unit ball is strongly pseudo-convex.
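
The computation behind this illustration is short enough to write out. This is a sketch based on the form (18); the Kronecker symbol $\delta_{jk}$ is standard notation introduced here.

```latex
\begin{align*}
\rho(z) &= |z|^2 - 1 = \sum_{j=1}^n z_j\bar z_j - 1,\\
\frac{\partial^2\rho}{\partial z_j\,\partial\bar z_k} &= \delta_{jk},\\
\sum_{1\le j,k\le n}\frac{\partial^2\rho}{\partial z_j\,\partial\bar z_k}\,\bar a_j a_k
  &= \sum_{j=1}^n |a_j|^2 > 0 \quad\text{for } a \ne 0 .
\end{align*}
```

Since the form is positive on all of $\mathbb{C}^n$, it is in particular positive on the tangential vectors, so every eigenvalue of the Levi form is strictly positive.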

Pseudo-convexity may be thought of as the complex analytic analog for $n > 1$ of the standard (real) convexity in $\mathbb{R}^d$; for the latter see Exercise 26 in Chapter 3 and the problems in Chapter 3 of Book III. The nature of the Levi form at $z^0$ turns out to have important implications for the behavior of holomorphic functions defined in $\Omega$ near $z^0$. In particular, we shall next see some interesting consequences that follow if one of the eigenvalues of the Levi form is strictly positive.


6 A maximum principle 

A noteworthy implication of the partial positivity of the Levi form is the following “local” maximum principle in $\mathbb{C}^n$, $n \ge 2$, which has no analog in the case $n = 1$.

Suppose we are given a region $\Omega$ with boundary of class $C^2$, and $B$ is an open ball centered at some point $z^0 \in \partial\Omega$. Assume that at each $z \in \partial\Omega \cap B$ at least one eigenvalue of the Levi form is strictly positive.

Theorem 6.1 In the above circumstances there exists a (smaller) ball $B' \subset B$, centered at $z^0$, so that whenever $F$ is a holomorphic function on $\Omega \cap B$ that is continuous on $\overline{\Omega} \cap B$, then

$$\sup_{z\in\Omega\cap B'}|F(z)| \le \sup_{z\in\partial\Omega\cap B}|F(z)|. \tag{19}$$

A counter-example to assertion (19) in the case $n = 1$ is outlined in Exercise 16.

Proof. We consider first the special situation when $z^0 = 0$ and $\Omega$ is given in the canonical form (16). We may assume that $\lambda_1 > 0$.

We write $z = (z_1, z'', z_n)$, where $z'' = (z_2, \dots, z_{n-1}) \in \mathbb{C}^{n-2}$, and we consider points of the form $(0, 0, iy_n)$. We denote by $B = B_r$ the ball of radius $r$ centered at the origin and prove that whenever $0 < y_n < cr^2$, with $r$ sufficiently small, then at these special points we have the preliminary conclusion

$$|F(0,0,iy_n)| \le \sup_{z\in\partial\Omega\cap B_r}|F(z)|. \tag{20}$$

Here $c$ is a constant to be chosen below ($c = \min(1, \lambda_1/2)$ will do).

This will be proved by considering the complex one-dimensional slice passing through the point $(0, 0, iy_n)$. Indeed, let $\Omega_1 = \{z_1 : (z_1, 0, iy_n) \in \Omega \cap B_r\}$. It is obvious that $\Omega_1$ is an open set containing the origin $z_1 = 0$, which corresponds to the point $(0, 0, iy_n)$. We note the following key fact: if $r$ is sufficiently small, then

(21) If $z_1 \in \partial\Omega_1$ then $(z_1, 0, iy_n) \in \partial\Omega \cap B_r$.

Indeed, if $z_1$ is on the boundary of the slice $\Omega_1$, then either $(z_1, 0, iy_n)$ is on the boundary of $\Omega$ or $(z_1, 0, iy_n)$ is on the boundary of $B_r$ (or both alternatives hold). In fact the second alternative is not possible, because if it held, then it would imply that $|z_1|^2 + y_n^2 = r^2$. Since $y_n < cr^2$ this yields $|z_1|^2 > r^2 - c^2r^4 \ge 3r^2/4$, if we take $c \le 1$ and $r \le 1/2$. Moreover since any such point must be in $\overline{\Omega}$ we must have that $y_n \ge \lambda_1|z_1|^2 + o(|z_1|^2)$ and therefore $cr^2 \ge \lambda_1 3r^2/4 + o(r^2)$, which is not possible if we take $c \le \lambda_1/2$ and $r$ is sufficiently small. Since now the second alternative has been ruled out, we have established (21).

Now for $y_n$ fixed, we define $f(z_1) = F(z_1, 0, iy_n)$. Then $f$ is a holomorphic function in $z_1$ on the slice $\Omega_1$, and is continuous on $\overline{\Omega_1}$. Since $0 \in \Omega_1$, the usual maximum principle implies

$$|F(0,0,iy_n)| = |f(0)| \le \sup_{z_1\in\Omega_1}|f(z_1)| = \sup_{z_1\in\partial\Omega_1}|f(z_1)| \le \sup_{z\in\partial\Omega\cap B_r}|F(z)|,$$

because of (21). Therefore the claim in (20) is established.

We will pass from this particular estimate to the general situation by showing that for every point $z \in \Omega$ sufficiently close to the boundary of $\Omega$, we can find an appropriate coordinate system so that with respect to it the point $z$ is given by $(0, 0, iy_n)$, and thus the conclusion (20) holds for $z$. This is done as follows.

First, for every point $z \in \Omega$ sufficiently close to $\partial\Omega$ there is a (unique) point $\pi(z) \in \partial\Omega$ which is nearest to $z$ and moreover, the vector from $\pi(z)$ to $z$ is perpendicular to the tangent plane at $\pi(z)$. Now at each $\pi(z) \in \partial\Omega$ we can introduce a coordinate system leading to the description (17) of $\Omega$ near $\pi(z)$. We also observe that the mapping from the initial ambient coordinates of $\mathbb{C}^n$ to those appearing in (17) is affine linear and preserves Euclidean distances. Because of the orthogonality of the vector from $\pi(z)$ to $z$ to the tangent plane, the point $z$ has coordinates $(0, 0, iy_n)$ in this coordinate system, and in fact $|z - \pi(z)| = y_n$.

With $B$ the initial ball centered at $z^0$, we will define $B' = B_\delta(z^0)$ to be the ball of radius $\delta$ centered at $z^0$. That radius will be determined by another radius $r$, so that $\delta = c_*r^2$, with the constant $c_*$ specified below. We will have $0 < c_* \le 1$, and ultimately take $r$ (and hence $\delta$) sufficiently small.

We can assume that $\lambda_1$ is the largest eigenvalue appearing in (17) and since $\partial\Omega$ is of class $C^2$, the quantity $\lambda_1$ varies continuously with the base point $\pi(z)$. We denote by $\lambda_*$ the infimum of these $\lambda_1$, and in parallel with the special case treated above we set $c_* = \min(1, \lambda_*/2)$.

We then note that if $z \in \Omega \cap B_\delta$ and we take $r$ sufficiently small, then:

• $|z - \pi(z)| < \delta$, and;

• $B_r(\pi(z)) \subset B$.

In fact if $z \in B_\delta(z^0)$, then $z^0 \in \partial\Omega$ implies that $d(z, \partial\Omega) < \delta$, which gives $|z - \pi(z)| < \delta$.

Secondly

$$|\zeta - z^0| \le |\zeta - \pi(z)| + |\pi(z) - z| + |z - z^0|,$$

so if $\zeta \in B_r(\pi(z))$, then $|\zeta - \pi(z)| < r$ while $|z - \pi(z)| < \delta$, and $|z - z^0| < \delta$ (since $z \in B_\delta$). This means that $|\zeta - z^0| < r + 2\delta$, and hence $\zeta \in B$ if $r$ (and then $\delta = c_*r^2$) are sufficiently small.
We can now return to the argument leading to the proof of the special case (20). With the ball $B_r(\pi(z))$ playing the role of $B_r$ above, we see as before that we obtain (20) by the maximum principle, because for $z \in \overline{\Omega} \cap B$ we have $y_n \ge \lambda_*|z_1|^2 + o(|z|^2)$, $z \to 0$, with an “o” term that is uniform as $z$ (and hence $\pi(z)$) varies. (This uniformity is a consequence of the fact that the corresponding “o” term in the Taylor development of $\varphi$ in (17) is uniform, by virtue of the fact that $\varphi$ is of class $C^2$.)

All this shows that if we take $r$ sufficiently small, and $\delta = c_*r^2$, then for $z \in B_\delta(z^0) = B'$, the conclusion of the theorem holds.

The implication of the theorem, and its proof, are valid in a more general setting where the boundary $\partial\Omega$ is replaced by a local hypersurface. This can be formulated as follows.

Suppose $M$ is a local $C^2$ hypersurface given in a ball $B$ with a defining function $\rho$, so that $M = \{z \in B : \rho(z) = 0\}$. Set $\Omega_- = \{z \in B : \rho(z) < 0\}$.

Corollary 6.2 Suppose the Levi form, as given by (18), has at least one strictly positive eigenvalue for each $z \in M$. Under these circumstances, for every $z^0 \in M$ there is a ball $B'$ centered at $z^0$ so that whenever $F$ is holomorphic in $\Omega_-$ and continuous in $\Omega_- \cup M$ we have

$$\sup_{z\in\Omega_-\cap B'}|F(z)| \le \sup_{z\in M}|F(z)|. \tag{22}$$

The theorem we have just proved tells us that when an eigenvalue of the Levi form is positive, the control of the restriction of a holomorphic function to a small piece of the boundary gives us a corresponding control of the function in an interior region. This is a strong hint that for such boundaries a local version of Bochner’s theorem (Theorem 4.1) should be valid. Our proof of this will be based on a remarkable extension of the Weierstrass approximation theorem, to which we now turn.


7 Approximation and extension theorems 

The classical Weierstrass approximation theorem can be restated to assert: a continuous function $f$ on a compact segment of the real axis in $\mathbb{C}^1$ can be uniformly approximated by polynomials in $z = x + iy$. The general question we will deal with is as follows. Suppose $M$ is a (local) hypersurface in $\mathbb{C}^n$. Given a continuous function $F$ on $M$, can $F$ be approximated on $M$ by polynomials $P_\ell$ in $z_1, z_2, \dots, z_n$?

Note that if $n > 1$, the restriction to $M$ of each $P_\ell$ necessarily satisfies the tangential Cauchy-Riemann equations, and so $F$ would necessarily have to satisfy these equations in at least some “weak” sense. We shall now see that this necessary condition is indeed sufficient. That is the thrust of the Baouendi-Treves approximation theorem stated below.

We suppose we are given a $C^2$ local hypersurface $M$ in $\mathbb{C}^n$, defined near $z^0 \in M$. After a complex affine-linear change of coordinates, the point $z^0$ is brought to the origin and $M$ is represented near $z^0$ as a graph

$$M = \{z = (z', z_n) : \mathrm{Im}(z_n) = \varphi(z', x_n)\}. \tag{23}$$

If we set $\rho(z) = \varphi(z', x_n) - y_n$, with $y_n = \mathrm{Im}(z_n)$, the tangential Cauchy-Riemann vector fields are spanned by

$$\rho_n\frac{\partial}{\partial\bar z_j} - \rho_j\frac{\partial}{\partial\bar z_n}, \qquad 1 \le j \le n-1,$$

with $\rho_j = \partial\rho/\partial\bar z_j$, and in particular $\rho_n = \frac{1}{2}(\varphi_{x_n} - i)$, where we define $\varphi_{x_n} = \partial\varphi/\partial x_n$. Thus we can write the corresponding tangential Cauchy-Riemann equations as

$$L_j(f) = 0, \qquad 1 \le j \le n-1,$$

with

$$L_j(f) = \frac{\partial f}{\partial\bar z_j} - a_j\frac{\partial f}{\partial\bar z_n}, \quad\text{where } a_j = \rho_j/\rho_n. \tag{24}$$
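
For orientation, the coefficients in (24) can be computed explicitly for the graph (23). This is a sketch assuming the usual convention $\partial/\partial\bar z_j = \tfrac12(\partial/\partial x_j + i\,\partial/\partial y_j)$.

```latex
\begin{align*}
\rho_n &= \frac{\partial}{\partial\bar z_n}\bigl(\varphi(z',x_n) - y_n\bigr)
        = \tfrac12\bigl(\varphi_{x_n} - i\bigr), \qquad
\rho_j = \frac{\partial\varphi}{\partial\bar z_j}\quad (1\le j\le n-1),\\
a_j &= \frac{\rho_j}{\rho_n}
     = \frac{2\,\partial\varphi/\partial\bar z_j}{\varphi_{x_n} - i},\\
L_j(\rho) &= \rho_j - a_j\,\rho_n = 0
  \qquad\text{(so each } L_j \text{ is tangential to } M\text{)},\\
L_j(x_n) &= -a_j\,\frac{\partial x_n}{\partial\bar z_n} = -\tfrac12\,a_j .
\end{align*}
```

The last line gives the coefficient of $\partial/\partial x_n$ when $L_j$ is expressed in the coordinates $(z', x_n)$ on $M$. Note also that $\rho_n$ never vanishes, since its imaginary part is $-1/2$.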


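In the coordinates $(z', x_n)$ on $M$, these become $L_j(f) = \frac{\partial f}{\partial\bar z_j} - \frac{a_j}{2}\frac{\partial f}{\partial x_n}$. Next, we define the transpose of $L_j$, namely, $L_j^t$, by

$$L_j^t(\psi) = -\frac{\partial\psi}{\partial\bar z_j} + \frac{1}{2}\frac{\partial(a_j\psi)}{\partial x_n},$$

so that

$$\int_{\mathbb{C}^{n-1}\times\mathbb{R}} L_j(f)\,\psi\,dz'\,dx_n = \int_{\mathbb{C}^{n-1}\times\mathbb{R}} f\,L_j^t(\psi)\,dz'\,dx_n$$

whenever both $f$ and $\psi$ are $C^1$ functions, with one of them having compact support. (We use the shorthand $dz'\,dx_n$ to designate Lebesgue measure on $\mathbb{C}^{n-1}\times\mathbb{R}$.) In view of the above we say that a continuous function $f$ satisfies the tangential Cauchy-Riemann equations in the weak sense if

$$\int_{\mathbb{C}^{n-1}\times\mathbb{R}} f\,L_j^t(\psi)\,dz'\,dx_n = 0$$

for all $\psi$ that are in $C^1$ and whose support is sufficiently small. Our theorem is then as follows: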

Theorem 7.1 Suppose $M \subset \mathbb{C}^n$ is a hypersurface of class $C^2$ as above. Given a point $z^0 \in M$, there are open balls $B'$ and $B$, centered at $z^0$, with $\overline{B'} \subset B$, so that: if $F$ is a continuous function in $M \cap B$ that satisfies the tangential Cauchy-Riemann equations in the weak sense, then $F$ can be uniformly approximated on $M \cap \overline{B'}$ by polynomials in $z_1, z_2, \dots, z_n$.

Two remarks may help to clarify the nature of the conclusion asserted above.

• The theorem holds for all $n \ge 1$. In the case $n = 1$ there are of course no tangential Cauchy-Riemann equations so the conclusion is valid without further assumptions on $F$. Note however that in general the scope of this theorem must be local in nature. A simple illustration of this arises already when $n = 1$ and $M$ is the boundary of the unit disc. See also Exercise 12.

• Note that for $n > 1$, there are no requirements on a Levi form related to $M$.

Proof. We shall first take $B$ small enough so that in $B$ the hypersurface $M$ has been represented by $M = \{y_n = \varphi(z', x_n)\}$ where $z^0$ corresponds to the origin. Besides $\varphi(0,0) = 0$, we can also suppose that the partial derivatives $\partial\varphi/\partial x_j$, $1 \le j \le n$, and $\partial\varphi/\partial y_j$, $1 \le j \le n-1$, vanish at the origin.

Now for each $u \in \mathbb{R}^{n-1}$, sufficiently close to the origin we define the slice $M_u$ of $M$ to be the $n$-dimensional sub-manifold given by

$$M_u = \{z : y_n = \varphi(z', x_n), \text{ with } z' = x' + iu\}.$$

We let $\Phi = \Phi^u$ be the mapping identifying the neighborhood of the origin in $\mathbb{R}^n$ with $M_u$ given by $\Phi(x) = (x' + iu,\ x_n + i\varphi(x' + iu, x_n))$ with $x = (x', x_n) \in \mathbb{R}^{n-1}\times\mathbb{R} = \mathbb{R}^n$. Observe that $M$ is fibered by the collection $\{M_u\}_u$. Now for fixed $u$, the Jacobian of the mapping $x \mapsto \Phi(x)$, that is, $\partial\Phi/\partial x$, is the complex $n \times n$ matrix given by $I + A(x)$, where the entries of $A(x)$ are zero, except in the last row, and in that row we have the vector $\left(i\frac{\partial\varphi}{\partial x_1}, i\frac{\partial\varphi}{\partial x_2}, \dots, i\frac{\partial\varphi}{\partial x_n}\right)$. So $A(0) = 0$, and $\det\left(\frac{\partial\Phi}{\partial x}\right) = 1 + i\varphi_{x_n}$. We shall need to shrink the ball $B$ further so that $\|A(x)\| \le 1/2$ on this ball, where $\|\cdot\|$ denotes the matrix-norm.

Now with $u$ fixed, the map $\Phi$ carries the Lebesgue measure on $\mathbb{R}^n$ to a measure (with complex density) $dm_u(z) = J(x)\,dx$ on $M_u$ defined by

$$\int_{M_u} f(z)\,dm_u(z) = \int_{\mathbb{R}^n} f(\Phi(x))\,J(x)\,dx, \quad\text{where } J(x) = \det\left(\frac{\partial\Phi}{\partial x}\right),$$

for every continuous function $f$ with sufficiently small support.

Next take $B'$ any ball with the same center as $B$ but strictly interior to it. Define $\chi$ to be a smooth (say $C^1$) cut-off function which is $1$ on a neighborhood of $\overline{B'}$, and vanishes when $z \notin B$. With this, define for each $u \in \mathbb{R}^{n-1}$ (close to the origin), and $\epsilon > 0$, the function $F_\epsilon^u$ by

$$F_\epsilon^u(\zeta) = \frac{1}{\epsilon^{n/2}}\int_{M_u} e^{-\frac{\pi}{\epsilon}(z-\zeta)^2}F(z)\chi(z)\,dm_u(z). \tag{25}$$

Here we use the shorthand $w^2 = w_1^2 + \cdots + w_n^2$ if $w = (w_1, \dots, w_n) \in \mathbb{C}^n$.

We should remark at this point that, like the classical approximation theorem, the argument below comes down to the fact that the functions $\epsilon^{-n/2}e^{-\pi x^2/\epsilon}$ form an “approximation to the identity” in $\mathbb{R}^n$.⁶
The $F_\epsilon^u$ have the following three properties:

(i) Each $F_\epsilon^u$ is an entire function of $\zeta \in \mathbb{C}^n$.

(ii) Whenever $\zeta \in M_u$ and $\zeta \in \overline{B'}$, the $F_\epsilon^u(\zeta)$ converge uniformly to $F(\zeta)$ as $\epsilon \to 0$.

(iii) For each $u$, $\lim_{\epsilon\to0} F_\epsilon^u(\zeta) - F_\epsilon^0(\zeta) = 0$, uniformly for $\zeta \in \overline{B'}$.

The first property is clear, since $e^{-\frac{\pi}{\epsilon}(z-\zeta)^2}$ is an entire function in $\zeta$ and the integration in $z$ is taken over a compact set.

⁶ For the classical theorem, see for instance Theorem 1.13 in Chapter 5 of Book I.

For the second property note that $z \in M_u$ and $\zeta = \xi + i\eta \in M_u$, if $z = \Phi(x)$ and $\zeta = \Phi(\xi)$, with $\Phi = \Phi^u$. Therefore

$$(z-\zeta)^2 = \left(\Phi(x) - \Phi(\xi)\right)^2 = \left(\frac{\partial\Phi}{\partial x}(\xi)(x-\xi)\right)^2 + O(|x-\xi|^3) = \left((I + A(\xi))(x-\xi)\right)^2 + O(|x-\xi|^3).$$

Now making our initial ball $B$ smaller if necessary (which of course decreases the size of $B'$), we can guarantee that whenever $z$ and $\zeta$ are in $B$

$$\mathrm{Re}\,(z-\zeta)^2 \ge c|x-\xi|^2, \qquad c > 0, \tag{26}$$

once we take into account that $\|A(\xi)\| \le 1/2$. Thus the exponential appearing in (25) can be written as $e^{-\frac{\pi}{\epsilon}((I+A(\xi))(x-\xi))^2} + O\!\left(\frac{|x-\xi|^3}{\epsilon}\,e^{-c'|x-\xi|^2/\epsilon}\right)$.

Thus $F_\epsilon^u(\zeta) = I + II$, with

$$I = \epsilon^{-n/2}\int_{\mathbb{R}^n} e^{-\frac{\pi}{\epsilon}((I+A(\xi))(x-\xi))^2} f(x)\,dx$$

and

$$II = O\!\left(\epsilon^{-n/2}\int_{\mathbb{R}^n}\frac{|v|^3}{\epsilon}\,e^{-c'|v|^2/\epsilon}\,dv\right),$$

with $f(x) = F(\Phi(x))\,\chi(\Phi(x))\,\det(I + A(x))$, where $I + A(x) = \partial\Phi/\partial x$. Now after a change of variables $v = x - \xi$ the first integral is handled by the following observation.

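Lemma 7.2 If $A$ is an $n \times n$ complex matrix with constant coefficients and $\|A\| < 1$ then for every $\epsilon > 0$

$$\frac{\det(I+A)}{\epsilon^{n/2}}\int_{\mathbb{R}^n} e^{-\frac{\pi}{\epsilon}((I+A)v)^2}\,dv = 1. \tag{27}$$

Corollary 7.3 If $f$ is a continuous function of compact support, then

$$\frac{\det(I+A)}{\epsilon^{n/2}}\int_{\mathbb{R}^n} e^{-\frac{\pi}{\epsilon}((I+A)v)^2} f(\xi + v)\,dv \to f(\xi)$$

uniformly in $\xi$ as $\epsilon \to 0$.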

To prove the lemma note that $\mathrm{Re}(((I+A)v)^2) \ge |v|^2 - \|A\||v|^2 \ge c|v|^2$, with $c > 0$, so that the integral in (27) converges. A change of scale reduces the identity to the case $\epsilon = 1$. Now if $A$ is real, a further change of variables $v' = (I+A)v$ (which is invertible since $\|A\| < 1$) reduces this case to the standard Gaussian integral. Finally, we pass to the general situation by analytic continuation, noting that the left-hand side of (27) is holomorphic in the entries of $A$, whenever $\|A\| < 1$. The corollary then follows from the usual arguments about approximations of the identity as in Section 4, Chapter 2 in Book I and Section 2 in Chapter 3 of Book III.
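
As a sanity check on the normalization in (27), here is the case $n = 1$ with a real number $a$, $|a| < 1$, in place of the matrix $A$. This only illustrates the change of variables used in the proof; the substitution variable $w$ is introduced here.

```latex
\begin{align*}
\frac{1+a}{\epsilon^{1/2}}\int_{\mathbb{R}} e^{-\frac{\pi}{\epsilon}(1+a)^2v^2}\,dv
  &= \int_{\mathbb{R}} e^{-\pi w^2}\,dw
  && \bigl(w = (1+a)\,\epsilon^{-1/2}\,v,\ \ dw = (1+a)\,\epsilon^{-1/2}\,dv\bigr)\\
  &= 1 .
\end{align*}
```

Since $1 + a > 0$, the substitution preserves orientation, and the last step is the standard Gaussian integral.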

Now the term $II$ is dominated by a multiple of $\epsilon^{-n/2}\int_{\mathbb{R}^n}\frac{|v|^3}{\epsilon}\,e^{-c'|v|^2/\epsilon}\,dv = c\,\epsilon^{1/2}$, as is seen by a change of scale. Thus property (ii) is proved.

Up to this point, we have not used the fact that $F$ satisfies the tangential Cauchy-Riemann equations. It is in the proof of property (iii) that this is crucial. We begin by considering the case where $F$ is assumed to be in class $C^1$. Later we will see how to lift this restriction. We recall that the tangential Cauchy-Riemann vector field $L_j$ is given by (24).

Lemma 7.4 Suppose $f$ is a $C^1$ function on $M$. Then

$$\frac{\partial}{\partial u_j}\left(\int_{M_u} f(z)\,dm_u(z)\right) = \frac{2}{i}\int_{M_u} L_j(f)\,dm_u(z), \tag{28}$$

for all $1 \le j \le n-1$.
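Proof. Recall that $\Phi(x) = \Phi^u(x) = (x' + iu,\ x_n + i\varphi(x' + iu, x_n))$ and from before we have that $\det\left(\frac{\partial\Phi}{\partial x}\right) = 1 + i\varphi_{x_n}$. Also, recall that $\rho(z) = \varphi(z', x_n) - y_n$, hence, for $1 \le j \le n-1$, one has

$$L_j = \frac{\partial}{\partial\bar z_j} - \frac{\rho_j}{\rho_n}\frac{\partial}{\partial\bar z_n} = \frac{\partial}{\partial\bar z_j} + \frac{2\,\partial\varphi/\partial\bar z_j}{i(1 + i\varphi_{x_n})}\frac{\partial}{\partial\bar z_n},$$

and therefore

$$\int_{M_u} L_j(f)\,dm_u(z) = \int_{\mathbb{R}^n}\frac{\partial f}{\partial\bar z_j}(1 + i\varphi_{x_n})\,dx + \frac{2}{i}\int_{\mathbb{R}^n}\frac{\partial\varphi}{\partial\bar z_j}\frac{\partial f}{\partial\bar z_n}\,dx,$$

where we simplify the writing by sometimes omitting $\Phi$ from the formulas. Now, starting from the left-hand side of (28)

$$\begin{aligned}
\frac{\partial}{\partial u_j}\int_{M_u} f(z)\,dm_u(z) &= \int_{\mathbb{R}^n}\frac{\partial}{\partial u_j}\left[f(\Phi)(1 + i\varphi_{x_n})\right]dx\\
&= \int_{\mathbb{R}^n}\left(\frac{\partial f}{\partial y_j} + \varphi_{y_j}\frac{\partial f}{\partial y_n}\right)(1 + i\varphi_{x_n})\,dx - i\int_{\mathbb{R}^n}\varphi_{y_j}\left(\frac{\partial f}{\partial x_n} + \varphi_{x_n}\frac{\partial f}{\partial y_n}\right)dx,
\end{aligned}$$

where we have used an integration by parts and the fact that $f$ has compact support to obtain the second integral on the right-hand side. Using the fact that $f$ has compact support again, we also note that

$$\begin{aligned}
0 &= \int_{\mathbb{R}^n}\frac{\partial}{\partial x_j}\left[f(\Phi)(1 + i\varphi_{x_n})\right]dx\\
&= \int_{\mathbb{R}^n}\left(\frac{\partial f}{\partial x_j} + \varphi_{x_j}\frac{\partial f}{\partial y_n}\right)(1 + i\varphi_{x_n})\,dx - i\int_{\mathbb{R}^n}\varphi_{x_j}\left(\frac{\partial f}{\partial x_n} + \varphi_{x_n}\frac{\partial f}{\partial y_n}\right)dx,
\end{aligned}$$

where once again we have integrated by parts to obtain the last integral. Combining the two results above we find that

$$\begin{aligned}
\frac{\partial}{\partial u_j}\int_{M_u} f(z)\,dm_u(z) &= -2i\int_{\mathbb{R}^n}\left(\frac{\partial f}{\partial\bar z_j} + \frac{\partial\varphi}{\partial\bar z_j}\frac{\partial f}{\partial y_n}\right)(1 + i\varphi_{x_n})\,dx - 2\int_{\mathbb{R}^n}\frac{\partial\varphi}{\partial\bar z_j}\left(\frac{\partial f}{\partial x_n} + \varphi_{x_n}\frac{\partial f}{\partial y_n}\right)dx\\
&= \frac{2}{i}\int_{\mathbb{R}^n}\frac{\partial f}{\partial\bar z_j}(1 + i\varphi_{x_n})\,dx - 4\int_{\mathbb{R}^n}\frac{\partial\varphi}{\partial\bar z_j}\frac{\partial f}{\partial\bar z_n}\,dx\\
&= \frac{2}{i}\int_{M_u} L_j(f)\,dm_u(z),
\end{aligned}$$

which is (28).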

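Now set $f(z) = \epsilon^{-n/2}e^{-\frac{\pi}{\epsilon}(z-\zeta)^2}F(z)\chi(z)$. Then

$$\frac{\partial}{\partial u_j}F_\epsilon^u(\zeta) = \frac{2}{i}\int_{M_u} L_j(f)\,dm_u(z)$$

because of Lemma 7.4. Now $L_j(f) = \epsilon^{-n/2}e^{-\frac{\pi}{\epsilon}(z-\zeta)^2}F\,L_j(\chi)$, since $e^{-\frac{\pi}{\epsilon}(z-\zeta)^2}$ is holomorphic in $z$, and $L_j(F) = 0$ by assumption. However $L_j(\chi)$ is supported at a positive distance from $\overline{B'}$. So if $\zeta \in \overline{B'}$, the inequality (26) guarantees that

$$\left|\frac{\partial}{\partial u_j}F_\epsilon^u(\zeta)\right| = O\!\left(\epsilon^{-n/2}e^{-c'/\epsilon}\right) \quad\text{as } \epsilon \to 0$$

for some $c' > 0$, and the property (iii) is established, under the assumption that $F \in C^1$.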

To complete the proof of the theorem note that a combination of (ii) and (iii) shows that $F_\epsilon^0$ converges uniformly to $F$ when $\zeta \in M \cap \overline{B'}$. Now each $F_\epsilon^0$, being an entire function of $\zeta$, can be uniformly approximated by polynomials in $\zeta$ for $\zeta$ in the compact set $\overline{B'}$. Altogether then, $F$ can be uniformly approximated by polynomials on $M \cap \overline{B'}$ and the theorem is proved in that case.

To pass to the general case note that what we have shown in (28) is that when $f$ is of class $C^1$, $u = (0, \dots, 0, u_j, 0, \dots, 0)$, and $v = (0, \dots, 0, v_j, 0, \dots, 0)$ then

$$F_\epsilon^u - F_\epsilon^v = \frac{2}{i}\int_{v_j}^{u_j}\int_{\mathbb{R}^n} L_j(f)\,J(x)\,dx\,dy_j. \tag{29}$$

To extend (29) to the case where $f$ is merely continuous, and $L_j(f)$ (taken in the sense of distributions) is also continuous, a limiting argument with (29), as it stands, will not suffice. This is because the “weak” definition of $L_j(f)$ requires an integration over $\mathbb{R}^n\times\mathbb{R}^{n-1}$, while in (29) we only integrate over $\mathbb{R}^n\times\mathbb{R}$. To get around this we observe first (still assuming $f \in C^1$ and has compact support) that (28) implies

$$-\int_{\mathbb{R}^n\times\mathbb{R}^{n-1}} f(\Phi^{y'}(x))\,\frac{\partial\psi}{\partial y_j}(y')\,J(x)\,dx\,dy' = \frac{2}{i}\int_{\mathbb{R}^n\times\mathbb{R}^{n-1}} f(\Phi^{y'}(x))\,L_j^t(\psi J)\,dx\,dy' \tag{30}$$

for any $C^1$ function $\psi$ on $\mathbb{R}^{n-1}$ having compact support. Now at this stage we can pass to an arbitrary continuous $f$ of compact support (by approximating such $f$ uniformly by $C^1$ functions) and see that (30) holds for $f$ that are merely continuous and of compact support.

As a result we have that

$$-\int_{\mathbb{R}^n\times\mathbb{R}^{n-1}} f(\Phi^{y'}(x))\,\frac{\partial\psi}{\partial y_j}(y')\,J(x)\,dx\,dy' = \frac{2}{i}\int_{\mathbb{R}^n\times\mathbb{R}^{n-1}} L_j(f)\,\psi(y')\,J(x)\,dx\,dy', \tag{31}$$

where $L_j(f)$ is taken in the sense of distributions (assuming that $L_j(f)$ is continuous).

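Now set $\psi(y') = \psi_\delta(y_j)\,\tilde\psi_\delta(\hat y)$, where $\hat y$ is defined by $\hat y = (y_1, \dots, y_{j-1}, y_{j+1}, \dots, y_{n-1})$. Here $\psi_\delta(y_j) = 1$ if $v_j \le y_j \le u_j$, and vanishes if $y_j \le v_j - \delta$ or $y_j \ge u_j + \delta$; in addition $\left|\frac{\partial\psi_\delta}{\partial y_j}\right| \le c\,\delta^{-1}$. As a result note that

$$-\int g(y_j)\,\frac{\partial\psi_\delta}{\partial y_j}\,dy_j \to g(u_j) - g(v_j) \quad\text{as } \delta \to 0,$$

for any continuous function $g$, since $-\partial\psi_\delta/\partial y_j$ is the difference of two approximations to the identity centered at $u_j$ and $v_j$, respectively.

Also $\tilde\psi_\delta(\hat y) = \delta^{-n+2}\,\tilde\psi(\hat y/\delta)$, where $\int_{\mathbb{R}^{n-2}}\tilde\psi(y)\,dy = 1$, making $\{\tilde\psi_\delta\}$ an approximation to the identity in $\mathbb{R}^{n-2}$. Inserting these in (31) and letting $\delta \to 0$ shows that the left-hand side of (31) converges to $F_\epsilon^u - F_\epsilon^v$, while the right-hand side converges to $\frac{2}{i}\int_{v_j}^{u_j}\int_{\mathbb{R}^n} L_j(f)\,J(x)\,dx\,dy_j$, and (29) is proved. The rest of the argument then continues as before, and the proof of the theorem is now complete.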


The approximation theorem just proved, together with the maximum principle in Section 6, leads directly to the famous Lewy extension theorem. Here again $M$ is a $C^2$ hypersurface given in a ball $B$ with $M = \{z \in B,\ \rho(z) = 0\}$. As before we set $\Omega_- = \{z \in B,\ \rho(z) < 0\}$.

Theorem 7.5 Suppose that the Levi form (18) has at least one strictly positive eigenvalue for each $z \in M$. Then for each $z^0 \in M$, there is a ball $B'$ centered at $z^0$ so that whenever $F_0$ is a continuous function on $M$ that satisfies the tangential Cauchy-Riemann equations in the weak sense, there exists an $F$ which is holomorphic in $\Omega_- \cap B'$, continuous in $\overline{\Omega_-} \cap B'$, and so that $F(z) = F_0(z)$ for $z \in M \cap B'$.

To prove the theorem we first use Theorem 7.1 to find a ball $B_1$ centered at $z^0$ so that $F_0$ can be uniformly approximated (on $M \cap B_1$) by polynomials $\{p_n(z)\}$. Then we invoke the corollary to Theorem 6.1 to find a ball $B'$ so that (22) holds (with $B_1$ in place of $B$). Therefore the $p_n$ also converge uniformly in $\Omega_- \cap B'$. The limit of this sequence, $F$, is then holomorphic there, continuous in $\overline{\Omega_-} \cap B'$, and gives the desired extension of $F_0$.

8 Appendix: The upper half-space 

In this appendix we want to illustrate some of the concepts discussed in the present 
chapter, as viewed in terms of a special model region. We will only sketch the 
proofs of the results, leaving the details to the interested reader, and providing 
some further relevant ideas in Exercises 17 to 19. 

The region we have in mind is the upper half-space $\mathcal{U}$ in $\mathbb{C}^n$ given by

$$\mathcal{U} = \{z \in \mathbb{C}^n : \mathrm{Im}(z_n) > |z'|^2\},$$

and its boundary

$$\partial\mathcal{U} = \{z \in \mathbb{C}^n,\ \mathrm{Im}(z_n) = |z'|^2\}, \tag{32}$$

with $z = (z', z_n)$, and $z' = (z_1, \dots, z_{n-1})$. It is prompted by the canonical form (16). The region $\mathcal{U}$ in $\mathbb{C}^n$, $n > 1$, plays a role similar to the upper half-plane in $\mathbb{C}^1$. The definitions suggest that $z_n$ can be thought of as the “classical” variable, while $z'$ is the “new” variable that comes about when $n > 1$. As in the case $n = 1$, the region $\mathcal{U}$ is holomorphically equivalent with the unit ball $\{w \in \mathbb{C}^n : |w| < 1\}$ via a fractional linear transformation, namely

$$w_n = \frac{i - z_n}{i + z_n}, \qquad w_k = \frac{2iz_k}{i + z_n}, \quad k = 1, \dots, n-1,$$

as the reader may easily verify.
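
The verification can be organized around a single identity. The following sketch restates the map as $w_n = (i - z_n)/(i + z_n)$, $w_k = 2iz_k/(i + z_n)$ and uses nothing else.

```latex
\begin{align*}
|i+z_n|^2 - |i-z_n|^2 &= 4\,\mathrm{Im}(z_n),\\
1 - |w|^2 &= 1 - \frac{|i-z_n|^2 + 4|z'|^2}{|i+z_n|^2}
           = \frac{4\bigl(\mathrm{Im}(z_n) - |z'|^2\bigr)}{|i+z_n|^2}.
\end{align*}
```

Thus $|w| < 1$ exactly when $\mathrm{Im}(z_n) > |z'|^2$, and $|w| = 1$ exactly on the boundary; note that $i + z_n \neq 0$ whenever $\mathrm{Im}(z_n) \ge 0$.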

This mapping also extends to a correspondence of the boundaries, except that the “south-pole” of the unit ball $(0, \dots, 0, -1)$ corresponds to the point at infinity of $\partial\mathcal{U}$. The analysis of the region $\mathcal{U}$ is enriched by a number of symmetries it enjoys.

The boundary of $\mathcal{U}$, which by (32) is parametrized by $(z', x_n) \in \mathbb{C}^{n-1}\times\mathbb{R}$, carries a natural measure $d\beta = dm(z', x_n)$, with the latter being Lebesgue measure on $\mathbb{C}^{n-1}\times\mathbb{R}$. More precisely, if $F_0$ is a function on $\partial\mathcal{U}$, and $F_0^\flat$ designates the corresponding function on $\mathbb{C}^{n-1}\times\mathbb{R}$,

$$F_0(z', x_n + i|z'|^2) = F_0^\flat(z', x_n),$$

then by definition

$$\int_{\partial\mathcal{U}} F_0\,d\beta = \int_{\mathbb{C}^{n-1}\times\mathbb{R}} F_0^\flat\,dm.$$

8.1 Hardy space 

In analogy with $\mathbb{C}^1$, we consider the Hardy space $H^2(\mathcal{U})$, which consists of all functions $F$, holomorphic in $\mathcal{U}$, that satisfy

$$\sup_{\epsilon>0}\int_{\partial\mathcal{U}} |F(z', z_n + i\epsilon)|^2\,d\beta < \infty.$$

For those $F$ the number $\|F\|_{H^2(\mathcal{U})}$ is defined as the square root of the above supremum. It will be convenient to abbreviate $F(z', z_n + i\epsilon)$ by $F_\epsilon(z)$, and sometimes also use the same symbol for the restriction of $F_\epsilon$ to $\partial\mathcal{U}$.

Theorem 8.1 Suppose $F \in H^2(\mathcal{U})$. Then, when restricted to $z \in \partial\mathcal{U}$, the limit

$$\lim_{\epsilon\to0} F_\epsilon = F_0$$

exists in the $L^2(\partial\mathcal{U}, d\beta)$ norm. Also

$$\|F\|_{H^2(\mathcal{U})} = \|F_0\|_{L^2(\partial\mathcal{U})}.$$

For several arguments below we use the following observation. 

Lemma 8.2 Suppose $B_1$ and $B_2$ are two open balls in $\mathbb{C}^{n-1}$, with $\overline{B_1} \subset B_2$. Then, whenever $f$ is holomorphic in $\mathbb{C}^{n-1}$

$$\sup_{z'\in B_1}|f(z')|^2 \le c\int_{B_2}|f(w')|^2\,dm(w').$$

Indeed for sufficiently small $\delta$, whenever $z' \in B_1$ then $B_\delta(z') \subset B_2$, so since $f$ is harmonic in $\mathbb{R}^{2n-2}$, the mean-value property and the Cauchy-Schwarz inequality give

$$|f(z')|^2 \le \frac{1}{m(B_\delta)}\int_{B_\delta(z')}|f(w')|^2\,dm(w'),$$

proving the claim.

The proof of the theorem can be given by the Fourier transform representation of each $F \in H^2(\mathcal{U})$ in analogy with the case $n = 1$ treated in Chapter 5 of Book III. We define the space $\mathcal{H}$ of functions $f(z', \lambda)$, with $(z', \lambda) \in \mathbb{C}^{n-1} \times \mathbb{R}^+$, that are jointly measurable, holomorphic in $z' \in \mathbb{C}^{n-1}$ for almost every $\lambda$, and for which

$$\|f\|_{\mathcal{H}}^2 = \int_0^\infty \int_{\mathbb{C}^{n-1}} |f(z', \lambda)|^2 e^{-4\pi\lambda|z'|^2} \, dm(z') \, d\lambda < \infty.$$

One can show that with this norm the space $\mathcal{H}$ is complete and hence a Hilbert space (see Exercises 18 and 19). With this, every $F \in H^2(\mathcal{U})$ can be represented as

$$F(z', z_n) = \int_0^\infty f(z', \lambda) e^{2\pi i \lambda z_n} \, d\lambda, \quad \text{with } f \in \mathcal{H}. \tag{33}$$

Proposition 8.3 If $f \in \mathcal{H}$, then the integral in (33) converges absolutely and uniformly for $(z', z_n)$ lying in compact subsets of $\mathcal{U}$, and $F \in H^2(\mathcal{U})$. Conversely any $F \in H^2(\mathcal{U})$ can be written as (33) for some $f \in \mathcal{H}$.

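In fact if $(z', z_n)$ belongs to a compact subset of $\mathcal{U}$, we may suppose that $\operatorname{Im}(z_n) > |z'|^2 + \epsilon$, for some $\epsilon > 0$. We will also restrict $z'$ to range in a ball $B_1$, with $\overline{B_1} \subset B_2$, and take the radius so small that $\operatorname{Im}(z_n) > |w'|^2 + \epsilon$, if $w' \in B_2$.

Now by the Cauchy-Schwarz inequality the absolute value of the integral in (33) is estimated by

$$\left( \int_0^\infty |f(z', \lambda)|^2 e^{-4\pi\lambda(y_n - \epsilon/2)} \, d\lambda \right)^{1/2} \left( \int_0^\infty e^{-4\pi\lambda\epsilon/2} \, d\lambda \right)^{1/2},$$

where $y_n = \operatorname{Im}(z_n)$. Invoking the lemma we get as an estimate for this

$$c \left( \int_0^\infty \int_{\mathbb{C}^{n-1}} |f(w', \lambda)|^2 e^{-4\pi\lambda|w'|^2} \, dm(w') \, d\lambda \right)^{1/2} c' \epsilon^{-1/2} = c'' \epsilon^{-1/2} \|f\|_{\mathcal{H}}.$$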


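This shows that the integral converges absolutely and uniformly when $z' \in B_1$ and $\operatorname{Im}(z_n) > |z'|^2 + \epsilon$, and thus uniformly on any compact subset of $\mathcal{U}$. Thus $F$ is holomorphic in $\mathcal{U}$. Observe next that for $F$ given by (33), $F_\epsilon(z) = F(z', z_n + i\epsilon)$ is given in terms of $f_\epsilon$, with $f_\epsilon(z', \lambda) = f(z', \lambda) e^{-2\pi\lambda\epsilon}$. Now for fixed $z'$, Plancherel’s theorem in the $x_n$ variable shows that

$$\int_{\mathbb{R}} |F_\epsilon(z', x_n + i|z'|^2)|^2 \, dx_n = \int_0^\infty |f_\epsilon(z', \lambda) e^{-2\pi\lambda|z'|^2}|^2 \, d\lambda.$$

Integrating in $z'$ gives

$$\int_{\partial\mathcal{U}} |F_\epsilon|^2 \, d\beta = \|f_\epsilon\|_{\mathcal{H}}^2 \le \|f\|_{\mathcal{H}}^2.$$

By the same token, $\int_{\partial\mathcal{U}} |F_\epsilon - F_{\epsilon'}|^2 \, d\beta = \|f_\epsilon - f_{\epsilon'}\|_{\mathcal{H}}^2 \to 0$ as $\epsilon, \epsilon' \to 0$. Thus $F_\epsilon$ converges in $L^2(\partial\mathcal{U}, d\beta)$ to a limit $F_0$ given by (33) with $y_n = |z'|^2$. Moreover

$$\|F_0\|_{L^2(\partial\mathcal{U})} = \|F\|_{H^2(\mathcal{U})} = \|f\|_{\mathcal{H}}. \tag{34}$$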
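
When $n = 1$ there is no $z'$ variable, and the Plancherel identity behind (34) can be tested on a single explicit function. The sketch below is an illustration, not part of the text: it assumes the sample choice $f(\lambda) = e^{-\lambda}$, for which (33) gives $F(z) = 1/(1 - 2\pi i z)$, and it evaluates the $x$-integral with the substitution $x = \tan\theta$ and a midpoint rule. All function names are hypothetical.

```python
import math


def F(z):
    """F(z) = integral over (0, inf) of e^{-lam} e^{2 pi i lam z}, i.e. 1/(1 - 2 pi i z)."""
    return 1.0 / (1.0 - 2j * math.pi * z)


def boundary_norm_sq(eps, n=4000):
    """Midpoint approximation of the integral of |F(x + i eps)|^2 over the real line.

    The substitution x = tan(theta) turns the line into (-pi/2, pi/2).
    """
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h
        x = math.tan(theta)
        total += abs(F(complex(x, eps))) ** 2 / math.cos(theta) ** 2
    return total * h


def fourier_side(eps):
    """Integral over (0, inf) of |e^{-lam} e^{-2 pi lam eps}|^2 = 1 / (2 + 4 pi eps)."""
    return 1.0 / (2.0 + 4.0 * math.pi * eps)
```

For each $\epsilon \ge 0$ the two functions should agree, and the value at $\epsilon = 0$ is $\|f\|^2 = 1/2$, which is also the supremum over $\epsilon > 0$.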
Conversely, suppose $F \in H^2(\mathcal{U})$. One observes that whenever $z'$ is restricted to a compact subset of $\mathbb{C}^{n-1}$,

$$|F(z', z_n + i\epsilon)| \le c_\epsilon \|F\|_{H^2}.$$

(Here we use Lemma 8.2 and also follow the reasoning used in the case $n = 1$ to study $H^2(\mathbb{R}^2_+)$ in Section 2 of Chapter 5 in Book III.) We set $F_\epsilon^\delta(z) = F(z', z_n + i\epsilon)(1 - i\delta z_n)^{-2}$. Then for each $z'$, the function $F_\epsilon^\delta(z', z_n)$ is in $H^2$ of the half-space $\{\operatorname{Im}(z_n) > |z'|^2\}$. So we may define $f_\epsilon^\delta(z', \lambda)$ by

$$f_\epsilon^\delta(z', \lambda) = \int_{\mathbb{R}} e^{-2\pi i \lambda (x_n + i y_n)} F_\epsilon^\delta(z', x_n + i y_n) \, dx_n,$$

noting that the right-hand side is independent of $y_n$, if $y_n > |z'|^2$, by Cauchy’s theorem. Also then $F_\epsilon^\delta$ is represented by (33) with $f_\epsilon^\delta$ in place of $f$ and $f_\epsilon^\delta \in \mathcal{H}$.

Now letting $\delta \to 0$ and using (34) we see that $F_\epsilon(z)$ is given by (33), with $f_\epsilon = f_\epsilon^\delta|_{\delta = 0}$ in place of $f$, and that $f_\epsilon \in \mathcal{H}$. Finally, since $F_\epsilon(z) = F(z', z_n + i\epsilon)$, we have that $f_\epsilon(z', \lambda) = f(z', \lambda) e^{-2\pi\lambda\epsilon}$, and using (34) again with $\epsilon \to 0$ gives us the representation (33) for our given $F \in H^2(\mathcal{U})$. The theorem is thus proved.

Remark. By the completeness of $\mathcal{H}$ given in Exercise 19 we see that $H^2(\mathcal{U})$ is also a Hilbert space.


We now ask:

Which $F_0 \in L^2(\partial\mathcal{U})$ arise as $\lim_{\epsilon \to 0} F_\epsilon$ for $F \in H^2(\mathcal{U})$?

When $n > 1$ the tangential Cauchy-Riemann operators provide the answer. If $F_0$ is given on $\partial\mathcal{U}$, recall that $F_0^{\flat}(z', x_n) = F_0(z', x_n + i|z'|^2)$ is the corresponding function on $\mathbb{C}^{n-1} \times \mathbb{R}$. In this setting the vector fields $L_j$, given by

$$L_j = \frac{\partial}{\partial \bar{z}_j} - i z_j \frac{\partial}{\partial x_n}, \quad j = 1, \dots, n-1,$$

form a basis for the tangential Cauchy-Riemann vector fields, as is given by (24), with $\rho(z) = |z'|^2 - \operatorname{Im}(z_n)$. Note that in this case $L_j^t = -L_j$. So here a function $G \in L^2(\mathbb{C}^{n-1} \times \mathbb{R})$ satisfies the tangential Cauchy-Riemann equations $L_j(G) = 0$, $j = 1, \dots, n-1$, in the weak sense, if

$$\int_{\mathbb{C}^{n-1} \times \mathbb{R}} G(z', x_n) \, L_j^t(\psi)(z', x_n) \, dm(z', x_n) = 0, \quad 1 \le j \le n-1, \tag{35}$$

for all $\psi$ that are (say) $C^\infty$ and have compact support.

Proposition 8.4 An $F_0$ in $L^2(\partial\mathcal{U})$ arises from an $F \in H^2(\mathcal{U})$ as in Theorem 8.1 if and only if $F_0^{\flat}$ satisfies the tangential Cauchy-Riemann equations in the weak sense.


Proof. First, assume that $F \in H^2(\mathcal{U})$. Then since $F_\epsilon$ is holomorphic in a neighborhood of $\overline{\mathcal{U}}$, the function $F_\epsilon^{\flat}$ satisfies $L_j(F_\epsilon^{\flat}) = 0$ in the usual sense. The fact that $F_\epsilon \to F_0$ in the $L^2(\partial\mathcal{U})$ norm (which is the same as $F_\epsilon^{\flat} \to F_0^{\flat}$ in $L^2(\mathbb{C}^{n-1} \times \mathbb{R})$) then implies that $F_0^{\flat}$ satisfies (35) with $G = F_0^{\flat}$.

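Conversely, suppose $G$ is in $L^2(\mathbb{C}^{n-1} \times \mathbb{R})$, and set

$$g(z', \lambda) = \int_{\mathbb{R}} e^{-2\pi i \lambda x_n} G(z', x_n) \, dx_n. \tag{36}$$

Also choose $\psi(z', x_n) = \psi_1(z') \psi_2(x_n)$. Then by Plancherel’s theorem in the $x_n$ variable,

$$\int_{\mathbb{R}} G(z', x_n) \frac{\partial \psi_2}{\partial x_n}(x_n) \, dx_n = -\int_{\mathbb{R}} g(z', \lambda) \, 2\pi i \lambda \, \hat{\psi}_2(-\lambda) \, d\lambda$$

for almost every $z'$. Integrating in $z'$ then shows that

$$\int_{\mathbb{C}^{n-1} \times \mathbb{R}} G(z', x_n) L_j^t(\psi)(z', x_n) \, dm(z', x_n) = -\int_{\mathbb{C}^{n-1}} \int_{\mathbb{R}} g(z', \lambda) \left( \frac{\partial \psi_1}{\partial \bar{z}_j}(z') - 2\pi\lambda z_j \psi_1(z') \right) \hat{\psi}_2(-\lambda) \, d\lambda \, dm(z').$$

So if $G$ satisfies (35) it follows that

$$\int_{\mathbb{C}^{n-1}} g(z', \lambda) \left( \frac{\partial \psi_1}{\partial \bar{z}_j}(z') - 2\pi\lambda z_j \psi_1(z') \right) dm(z') = 0$$

for almost every $\lambda$, and this means that

$$\int_{\mathbb{C}^{n-1}} f(z', \lambda) \frac{\partial}{\partial \bar{z}_j} \left( \psi_1(z') e^{-2\pi\lambda|z'|^2} \right) dm(z') = 0,$$

where $f(z', \lambda) = g(z', \lambda) e^{2\pi\lambda|z'|^2}$, which itself implies that $f(z', \lambda)$ satisfies the Cauchy-Riemann equations in $\mathbb{C}^{n-1}$ in the weak sense, for almost every $\lambda$. But we saw in Section 1 that this shows that the functions $f(z', \lambda)$ are holomorphic in $z'$. Now (36) and the Fourier inversion formula show that

$$\int_{\mathbb{R}} \int_{\mathbb{C}^{n-1}} |g(z', \lambda)|^2 \, dm(z') \, d\lambda = \int_{\mathbb{R}} \int_{\mathbb{C}^{n-1}} |f(z', \lambda)|^2 e^{-4\pi\lambda|z'|^2} \, dm(z') \, d\lambda$$

are both finite. Also, with $F$ given by (33), we have $G(z', x_n) = F(z', x_n + i|z'|^2)$.

Finally, because $\int_{\mathbb{C}^{n-1}} |f(z', \lambda)|^2 e^{-4\pi\lambda|z'|^2} \, dm(z') < \infty$ for almost every $\lambda$, then necessarily $f(z', \lambda) = 0$ for those $\lambda$ that are negative. Thus we have given $G$ as $F_0^{\flat}$, with $F$ as in (33), and $f \in \mathcal{H}$. The proposition is therefore proved.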


8.2 Cauchy integral 

The Cauchy integral$^7$ in $\mathcal{U}$ can be defined as follows. For each $z, w \in \mathbb{C}^n$ we set

$$r(z, w) = \frac{i}{2}(\bar{w}_n - z_n) - z' \cdot \bar{w}'$$

with $z = (z', z_n)$, $w = (w', w_n)$ and

$$z' \cdot \bar{w}' = z_1 \bar{w}_1 + \cdots + z_{n-1} \bar{w}_{n-1}.$$

$^7$ Also referred to as the Cauchy-Szegő integral.

Note that $r(z, w)$ is holomorphic in $z$, conjugate holomorphic in $w$, and $r(z, z) = \operatorname{Im}(z_n) - |z'|^2 = -\rho(z)$, with $\rho$ the defining function for $\mathcal{U}$ used earlier.

Next, we define

$$S(z, w) = c_n \, r(z, w)^{-n}, \quad \text{where } c_n = \frac{(n-1)!}{4\pi^n}.$$

Observe that $S(z, w) = \overline{S(w, z)}$, and that for each $w \in \mathcal{U}$, the function $z \mapsto S(z, w)$ is in $H^2(\mathcal{U})$. Also for each $z \in \mathcal{U}$, the function $w \mapsto S(z, w)$ is in $L^2(\partial\mathcal{U})$. We define the Cauchy integral $C(f)$ of a function $f$ on $\partial\mathcal{U}$ by

$$C(f)(z) = \int_{\partial\mathcal{U}} S(z, w) f(w) \, d\beta(w), \quad z \in \mathcal{U}. \tag{37}$$

The reproducing property of $C$ is what interests us here.

Theorem 8.5 Suppose $F \in H^2(\mathcal{U})$, and let $F_0 = \lim_{\epsilon \to 0} F_\epsilon$ as in Theorem 8.1. Then

$$C(F_0)(z) = F(z). \tag{38}$$

The key lemma used is an observation giving a reproducing identity for a related space of entire functions on $\mathbb{C}^{n-1}$. We consider the holomorphic functions $f$ on $\mathbb{C}^{n-1}$ for which

$$\int_{\mathbb{C}^{n-1}} |f(z')|^2 e^{-4\pi\lambda|z'|^2} \, dm(z') < \infty,$$

where $\lambda > 0$ is fixed.

Lemma 8.6 For $f$ as above, we have

$$f(z') = \int_{\mathbb{C}^{n-1}} K_\lambda(z', w') f(w') e^{-4\pi\lambda|w'|^2} \, dm(w') \tag{39}$$

with $K_\lambda(z', w') = (4\lambda)^{n-1} e^{4\pi\lambda z' \cdot \bar{w}'}$.

Proof. In fact, consider first the case when $4\lambda = 1$, and $z' = 0$. Then (39), which states $f(0) = \int_{\mathbb{C}^{n-1}} f(w') e^{-\pi|w'|^2} \, dm(w')$, is a simple consequence of the mean-value property of $f$ (taken on spheres in $\mathbb{C}^{n-1}$ centered at the origin) and the fact that $\int_{\mathbb{C}^{n-1}} e^{-\pi|z'|^2} \, dm(z') = 1$.

We now apply this identity to $w' \mapsto f(z' + w') e^{-\pi w' \cdot \bar{z}'}$, for fixed $z'$. The result is then (39) when $4\lambda = 1$. A simple rescaling argument then gives (39) in general.
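
The reproducing identity (39) can be tested numerically in the simplest case $n - 1 = 1$. The sketch below is only an illustration under stated assumptions: the function name `reproduce`, the truncation radius, and the grid size are hypothetical choices, and the integral over $\mathbb{C}$ is approximated by a midpoint rule on a square, which is adequate here because the Gaussian weight makes the integrand negligible outside it.

```python
import cmath
import math


def reproduce(f, z, lam, radius=4.0, n=240):
    """Approximate the right side of (39) for n - 1 = 1.

    Integrates K(z, w) f(w) exp(-4 pi lam |w|^2) over the square
    [-radius, radius]^2, with K(z, w) = 4 lam exp(4 pi lam z conj(w)).
    """
    h = 2.0 * radius / n
    total = 0j
    for a in range(n):
        u = -radius + (a + 0.5) * h
        for b in range(n):
            v = -radius + (b + 0.5) * h
            w = complex(u, v)
            kernel = 4.0 * lam * cmath.exp(4.0 * math.pi * lam * z * w.conjugate())
            weight = math.exp(-4.0 * math.pi * lam * abs(w) ** 2)
            total += kernel * f(w) * weight
    return total * h * h


def sample_polynomial(w):
    """An arbitrary entire function used as test data."""
    return w ** 3 - 2 * w + 1
```

For instance, `reproduce(sample_polynomial, 0.3 + 0.2j, 0.5)` should return a value close to `sample_polynomial(0.3 + 0.2j)`.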


Turning to the proof of the theorem, we observe that, with $c_n = (n-1)!/(4\pi^n)$,

$$S(z, w) = c_n \, r(z, w)^{-n} = \int_0^\infty (4\lambda)^{n-1} e^{-4\pi\lambda r(z, w)} \, d\lambda,$$

since $\int_0^\infty \lambda^{n-1} e^{-A\lambda} \, d\lambda = (n-1)! \, A^{-n}$, whenever $\operatorname{Re}(A) > 0$. So, at least formally,

$$\int_{\partial\mathcal{U}} S(z, w) F_0(w) \, d\beta(w) = \int_0^\infty \int_{\mathbb{C}^{n-1} \times \mathbb{R}} F_0(w', u_n + i|w'|^2) \, (4\lambda)^{n-1} e^{-4\pi\lambda r(z, w)} \, dm(w', u_n) \, d\lambda.$$

But as we have seen

$$\int_{\mathbb{R}} F_0(w', u_n + i v_n) e^{-2\pi i \lambda (u_n + i v_n)} \, du_n = f(w', \lambda).$$

Now insert this in the above, recalling that $r(z, w) = \frac{i}{2}(\bar{w}_n - z_n) - z' \cdot \bar{w}'$, and that

$$(4\lambda)^{n-1} \int_{\mathbb{C}^{n-1}} f(w', \lambda) e^{4\pi\lambda z' \cdot \bar{w}'} e^{-4\pi\lambda|w'|^2} \, dm(w') = f(z', \lambda).$$

The result is that

$$\int_{\partial\mathcal{U}} S(z, w) F_0(w) \, d\beta(w) = \int_0^\infty f(z', \lambda) e^{2\pi i \lambda z_n} \, d\lambda,$$

which by (33) is what we want to obtain.

To make this argument rigorous, we proceed as in the proof of Theorem 8.1, with the improved function $F_\epsilon^\delta$ in place of $F$. Then all the integrals in question converge absolutely, and therefore the interchanges of integration are justified. This gives the reproducing property (38) for $F_\epsilon^\delta$ instead of $F$. Then we let $\delta \to 0$, and next $\epsilon \to 0$, giving (38) for any $F \in H^2(\mathcal{U})$.
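
The integral representation of the kernel used at the start of this proof can be checked numerically. The sketch below is an illustration under explicit assumptions: it takes the normalization $c_n = (n-1)!/(4\pi^n)$, so that the $\lambda$-integral carries the factor $(4\lambda)^{n-1}$ matching the kernel $K_\lambda$ of Lemma 8.6, and it uses hypothetical function names together with a composite Simpson rule truncated at a finite upper limit.

```python
import cmath
import math


def r(z, w):
    """r(z, w) = (i/2)(conj(w_n) - z_n) - z' . conj(w')."""
    *zp, zn = z
    *wp, wn = w
    dot = sum(a * b.conjugate() for a, b in zip(zp, wp))
    return 0.5j * (wn.conjugate() - zn) - dot


def S_closed(z, w):
    """Closed form c_n r(z, w)^(-n), assuming c_n = (n-1)! / (4 pi^n)."""
    n = len(z)
    return math.factorial(n - 1) / (4.0 * math.pi ** n) * r(z, w) ** (-n)


def S_integral(z, w, upper=8.0, steps=20000):
    """Simpson approximation of the integral of (4 lam)^(n-1) exp(-4 pi lam r)."""
    n = len(z)
    A = 4.0 * math.pi * r(z, w)

    def g(lam):
        return (4.0 * lam) ** (n - 1) * cmath.exp(-A * lam)

    h = upper / steps  # steps must be even for Simpson's rule
    total = g(0.0) + g(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0
```

With $z \in \mathcal{U}$ and $w \in \partial\mathcal{U}$ the real part of $r(z, w)$ is positive, so the integrand decays exponentially. For $n = 1$ the closed form reduces to the classical Cauchy kernel $1/(2\pi i (u - z))$ on the real line.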


8.3 Non-solvability 

We will use the Cauchy integral C to illuminate a basic example of Lewy of a 
non-solvable partial differential equation. 

Here we look at $\mathcal{U}$ in $\mathbb{C}^2$, with its boundary parametrized by $\mathbb{C} \times \mathbb{R}$. We consider the tangential Cauchy-Riemann vector field $L = L_1 = \frac{\partial}{\partial \bar{z}_1} - i z_1 \frac{\partial}{\partial x_2}$ and show that in order for $L(U) = f$ to be even locally solvable, the function $f$ must satisfy a stringent necessary condition. For purposes of the statement of the result, it will be more convenient to deal with

$$\bar{L} = \frac{\partial}{\partial z_1} + i \bar{z}_1 \frac{\partial}{\partial x_2}$$

instead of $L$. (To revert back to $L$ then one needs only to replace $f$ by its conjugate.)

We consider the Cauchy integral (37), written now as acting on functions on $\mathbb{C} \times \mathbb{R}$, identified with $\partial\mathcal{U}$ in $\mathbb{C}^2$. If $f$ is such a function then (37) takes the form

$$C(f)(z) = \int_{\mathbb{C} \times \mathbb{R}} S(z, u_2 + i|w_1|^2) f(w_1, u_2) \, dm(w_1, u_2). \tag{40}$$

We can extend (40) to define the Cauchy integral when $f$ is a distribution (say of compact support), by setting

$$C(f)(z) = \langle f, S(z, u_2 + i|w_1|^2) \rangle, \quad z \in \mathcal{U}.$$

Here $\langle \cdot, \cdot \rangle$ is a pairing between the distribution $f$ and the $C^\infty$ function $(w_1, u_2) \mapsto S(z, u_2 + i|w_1|^2)$, with $z$ fixed. The necessary condition is then:

(41) $C(f)(z)$ has an analytic continuation to a neighborhood of 0.

Note that this property depends only on the behavior of $f$ near the origin. Indeed, if $f_1$ agrees with $f$ near the origin, then $C(f - f_1)$ is automatically holomorphic near the origin, because visibly $S(z, w)$ is holomorphic for $z$ in a small neighborhood of the origin, with $w$ staying outside a given neighborhood of the origin in $\mathbb{C}^n$.

Theorem 8.7 Suppose $U$ is a distribution defined on $\mathbb{C} \times \mathbb{R}$, so that $\bar{L}(U) = f$ in a neighborhood of the origin. Then (41) must hold.

Proof. Assume first that $U$ has compact support, and $\bar{L}(U) = f$ everywhere. Then

$$C(f)(z) = \langle f, S(z, u_2 + i|w_1|^2) \rangle = \langle \bar{L}(U), S(z, u_2 + i|w_1|^2) \rangle = -\langle U, \bar{L}(S(z, u_2 + i|w_1|^2)) \rangle = 0,$$

since $\bar{L}(S(z, u_2 + i|w_1|^2)) = 0$, because $w \mapsto S(z, w)$ is conjugate holomorphic. Thus trivially $C(f)(z)$ is holomorphic everywhere.

If $U$ does not have compact support and $\bar{L}(U) = f$ only in a neighborhood of the origin, then replace $U$ by $\eta U$, with $\eta$ a $C^\infty$ cut-off function that is 1 near the origin. With $U' = \eta U$, then $\bar{L}(U') = f'$ everywhere, so $C(f') = 0$ but $C(f - f')$ is analytic near the origin because $f - f'$ vanishes near the origin of $\mathbb{C} \times \mathbb{R}$. Therefore (41) holds.

We give a particular example. Take the function

$$F(z_1, z_2) = e^{-(z_2/i)^{1/2}} e^{-(i/z_2)^{1/2}} = F(z_2).$$

It is easy to verify that $F$ is holomorphic in the half-plane $\operatorname{Im}(z_2) > 0$, continuous (in fact $C^\infty$) in the closure, and rapidly decreasing as a function of $(z_1, z_2) \in \overline{\mathcal{U}}$. However it is clearly not holomorphic in a neighborhood of the origin.

Now set $f = F|_{\partial\mathcal{U}}$, that is, in the $\mathbb{C} \times \mathbb{R}$ coordinates, $f(z_1, x_2) = F(x_2 + i|z_1|^2)$. Then $C(f) = F$ by Theorem 8.5, so $C(f)$ has no analytic continuation to a neighborhood of the origin and (41) fails.

Thus we have reached the conclusion that $\bar{L}(U) = f$ is not locally solvable near the origin, even though this particular $f$ is a $C^\infty$ function.

9 Exercises 

1. Suppose $f$ is holomorphic in a polydisc $P_r(z^0)$, and assume that $f$ vanishes in a neighborhood of $z^0$. Then $f = 0$ throughout $P_r(z^0)$.

[Hint: Expand $f(z) = \sum a_\alpha (z - z^0)^\alpha$ in $P_r(z^0)$, using Proposition 1.1, and note that all $a_\alpha$ are zero.]


2. Show that:

(a) If $f$ is holomorphic in a pair $P_\sigma(z^0)$ and $P_\tau(z^0)$ of polydiscs centered at $z^0$ with $\sigma = (\sigma_1, \dots, \sigma_n)$ and $\tau = (\tau_1, \dots, \tau_n)$, then $f$ extends to be holomorphic in $P_r(z^0)$, wherever $r = (r_1, \dots, r_n)$ and $r_j \le \sigma_j^{1-\theta} \tau_j^{\theta}$, $1 \le j \le n$, for some $0 \le \theta \le 1$.

(b) If $S = \{s = (s_1, \dots, s_n),\ s_j = \log r_j$, where $f$ is holomorphic in $P_r(z^0)\}$, then $S$ is a convex set.

[Hint: Consider $\sum a_\alpha (z - z^0)^\alpha$ that represents $f$ both in $P_\sigma(z^0)$ and $P_\tau(z^0)$.]

3. Given any open subset $\Omega$ of $\mathbb{C}^1$, construct a holomorphic function $f$ in $\Omega$ that cannot be continued analytically outside $\Omega$.

[Hint: Given any sequence of points $\{z_j\}$ in $\Omega$, which does not have a limit point in $\Omega$, there exists an analytic function in $\Omega$ vanishing exactly at those $z_j$.]


4. Suppose $\Omega$ is a bounded region in $\mathbb{C}^n$, $n > 1$, and $f$ is holomorphic in $\Omega$. Suppose $Z$, the zero set of $f$, is non-empty. Then $\overline{Z}$ intersects $\partial\Omega$, that is, $\overline{Z} \cap \partial\Omega$ is not empty.

[Hint: Let $w$ be a point in $\Omega$. Let $z^0 \in Z$ be a point furthest from $w$. Define $\gamma$ to be the unit vector in the direction from $z^0$ to $w$, and let $\nu$ be another unit vector so that both $\nu$ and $i\nu$ are perpendicular to $\gamma$. Consider the one-variable function $h_\epsilon(\zeta)$ given by $h_\epsilon(\zeta) = f(z^0 - \epsilon\gamma + \zeta\nu)$. Then for $\epsilon > 0$, the function $h_\epsilon(\zeta)$ does not vanish in a fixed neighborhood of $\zeta = 0$.]


5. Suppose $f$ is continuous and has compact support in $\mathbb{C}^1$.

(a) Show that $u = f * \Phi$ in Proposition 3.1 belongs to $\mathrm{Lip}(\alpha)$, for every $\alpha < 1$.

(b) Show that $u$ is not necessarily in $C^1$.

[Hint: For (b) consider f(z) = z(log(l/|z|)) € but modified away from the origin to 
have compact support.] 


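6. Verify the identity in $\mathbb{C}^1$

$$F(z) = \frac{1}{2\pi i} \int_{\partial\Omega} \frac{F(\zeta)}{\zeta - z} \, d\zeta - \frac{1}{\pi} \int_{\Omega} \frac{(\partial F / \partial \bar{\zeta})(\zeta)}{\zeta - z} \, dm(\zeta)$$

for appropriate regions $\Omega$ and $C^1$ functions $F$. Use this identity to give an alternative proof of Proposition 3.1.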


7. Prove the following. The necessary and sufficient condition that the solution $u(z) = \frac{1}{\pi} \int \frac{f(\zeta)}{z - \zeta} \, dm(\zeta)$ of $\partial u / \partial \bar{z} = f$ in $\mathbb{C}^1$, have compact support when $f$ has compact support, is that

$$\int_{\mathbb{C}^1} \zeta^n f(\zeta) \, dm(\zeta) = 0,$$

for all $n \ge 0$.

[Hint: In one direction, note that $\frac{\partial}{\partial \bar{z}}(z^n u(z)) = z^n f(z)$. For the converse, observe that for large $z$, $u(z) = \sum_{n=0}^\infty a_n z^{-n-1}$, with $a_n = \frac{1}{\pi} \int \zeta^n f(\zeta) \, dm(\zeta)$.]

8. Suppose $\Omega$ is a region in $\mathbb{R}^d$ with a defining function $\rho$ that is of class $C^k$.

(a) If $F$ is a $C^k$ function defined on $\mathbb{R}^d$ and $F = 0$ on $\partial\Omega$, show that $F = a\rho$, with $a \in C^{k-1}$.

(b) Suppose $F_1 = F_2$ on $\partial\Omega$. Show that if $X$ is any tangential vector field then

$$X(F_1)|_{\partial\Omega} = X(F_2)|_{\partial\Omega}.$$

[Hint: Write $F_1 - F_2 = a\rho$.]

9. Verify that the extension $F$ given by Theorem 4.1 is the unique solution to the Dirichlet problem for $\Omega$ with boundary data $F_0$.


10. Use the region $\{z \in \mathbb{C}^n : \rho < |z| < 1\}$ to show that the connectedness hypotheses in Theorem 3.3 and Theorem 4.1 are necessary.


11. That the connectedness properties in the hypotheses of Theorems 3.3 and 4.1 

are related can be seen as follows. Suppose Q is a bounded region with C l 
boundary. For € > 0, let be the “collar” defined by {z : d(z, dil) < e}, and 
let fl Then for sufficiently small e the following are equivalent: 

(i) Q c is connected, 

(ii) is connected, 

(iii) is connected. 

[Hint: For instance to see why (ii) or (iii) implies (i), suppose Pi and P 2 are two 
points in Q , and let Ti and 厂 2 denote the connected components of Q which 
contain Pi and P 2 respectively. Connect P\ to a point Qi on dQ D Ti, and P 2 to 
a point Q 2 on dil fl Since is connected one can then connect Q\ to Q 2 by 
a path in 0°. 

Conversely, to show that (i) implies (iii) for example, let $A$ be a point in $\Omega$, and $B$ a point in $\Omega^c$. If $P_0$ and $P_1$ belong to $\partial\Omega$, let $\gamma_0$ be any path starting at $A$ traveling in $\Omega$, passing through $P_0$, then traveling in $\Omega^c$ ending at $B$. Similarly, let $\gamma_1$ be a path connecting $A$ to $B$ passing through $P_1$. These paths can be constructed because both $\Omega$ and $\Omega^c$ are connected. Then, since $\mathbb{C}^n$ is simply-connected, deform the path $\gamma_0$ into $\gamma_1$, and denote such transformation by $s \mapsto \gamma_s$ with $0 \le s \le 1$. To conclude, consider the intersection of $\gamma_s$ with $\partial\Omega$.]

12. Let $\Omega$ be a simply connected bounded region in $\mathbb{C}^1$ with a boundary of class $C^1$. Suppose $F_0$ is a given continuous function on $\partial\Omega$. Show that a necessary and sufficient condition that there is an $F$, holomorphic in $\Omega$, continuous on $\overline{\Omega}$, so that $F = F_0$ on $\partial\Omega$, is that $\int_{\partial\Omega} z^n F_0(z) \, dz = 0$, for $n = 0, 1, 2, \dots$.
[Hint: One direction is clear from Cauchy’s theorem. For the converse define
F ± (z) = 2 ^- f dQ according to whether z ^ Q or z ^ Q°. Now the hypothe¬ 

sis implies that F + (z) = 0^ z G Q°. Also F~ {z) — F + (z) Fo(C), z — C if C G 
z £ the segment [z，（] is normal to the tangent line of dQ at C, and z is the re¬ 
flection of z across that line. That is, 二 C ，z £ Cl . The convergence asserted 

is related to the expression of the delta function given by inS = ^ (^75 — , 

in Section 2 of Chapter 3.] 

13. Show that with an additional change of variables, that is, introducing complex coordinates, the canonical representations (16) and (17) of the boundary can be simplified to state

$$y_n = \sum_{j=1}^{n-1} \lambda_j |z_j|^2 + o(|z'|^2), \quad \text{for } z' \to 0.$$

[Hint: Consider the change of variables $z_n \mapsto z_n - z_n(c_1 z_1 + \cdots + c_{n-1} z_{n-1} + D z_n)$, $z_j \mapsto z_j$, $1 \le j \le n-1$, for suitable constants $c_1, \dots, c_{n-1}$.]


14. The fact that when $n = 1$ there are no local holomorphic invariants at boundary points is indicated by the following fact. Suppose $\gamma$ is a $C^k$ curve in $\mathbb{C}^1$. Then for every $z^0 \in \gamma$, there is a holomorphic bijection $\Phi$ of a neighborhood of $z^0$ to a neighborhood of the origin, so that $\Phi(\gamma)$ is the curve $\{y = \varphi(x)\}$, with $\varphi(x) = o(x^k)$ as $x \to 0$.

[Hint: Suppose y = a 2 X 2 + • • • + akX k + o(x k ) as x —^ 0, and consider <l > _1 defined 
by 少 _1 ⑷ =zi (j2 k j=2 

15. Consider the hypersurface $M$ in $\mathbb{C}^3$ given by $M = \{\operatorname{Im}(z_3) = |z_1|^2 - |z_2|^2\}$. Show that $M$ has the remarkable property that any holomorphic function $F$ defined in a neighborhood of $M$ continues analytically into all of $\mathbb{C}^3$.

[Hint: Use Theorem 7.5 to find a fixed ball $B$ centered at the origin so that $F$ continues into all of $B$. Then rescale.]


16. That the maximum principle of Theorem 6.1 does not hold in the case n = 1 
can be seen as follows. Start with f(e l6 ) £ C°°, so that / > 0, f(e 10 ) = 0 for |^| < 
7 T/ 2 , f(e 10 ) = 1 for 3 tt/4 < |^| < tt. Write f(e ie ) = 5：=。 + E -^' 1 〜 e— ， 
G(z) = and Fn(z) = e NG 、 z 、 . Verify that Fn is continuous in the 

closed disc \z\ < 1 , \FN(e l9 )\ = 1 , for |^| < n/2 but |Fat(z)| > cie C 2 ； v ( 1 _ l z l) in the 
closed disc, for two positive constants ci and C 2 . 

[Hint: G(z) — u-\- iv where n(r, 0) = f * P r , with P r the Poisson kernel.] 


17. Verify the following:

(a) The inverse of the mapping of $\mathcal{U}$ to the unit ball given in the Appendix is

$$z_n = i \left( \frac{1 - w_n}{1 + w_n} \right), \quad \text{and} \quad z_k = \frac{w_k}{1 + w_n}, \quad k = 1, \dots, n-1.$$

(b) For each $(\zeta, t) \in \mathbb{C}^{n-1} \times \mathbb{R}$ consider the following “translation” on $\mathbb{C}^n$ given by

$$\tau_{(\zeta, t)}(z', z_n) = (z' + \zeta,\ z_n + t + 2i(z' \cdot \bar{\zeta}) + i|\zeta|^2).$$

Then $\tau_{(\zeta, t)}$ maps $\mathcal{U}$ and $\partial\mathcal{U}$ to themselves, respectively. Composing these mappings leads to the composition formula

$$(\zeta, t) \cdot (\zeta', t') = (\zeta + \zeta',\ t + t' + 2\operatorname{Im}(\zeta \cdot \bar{\zeta}')).$$

Under this law $\mathbb{C}^{n-1} \times \mathbb{R}$ becomes the “Heisenberg group.”

(c) $\mathcal{U}$ (as well as $\partial\mathcal{U}$) is invariant under the “non-isotropic” dilations $(z', z_n) \mapsto (\delta z', \delta^2 z_n)$, $\delta > 0$.

(d) Both $\mathcal{U}$ and $\partial\mathcal{U}$ are invariant under the mappings $(z', z_n) \mapsto (u(z'), z_n)$, where $u$ is a unitary mapping of $\mathbb{C}^{n-1}$.


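18. Define $\mathcal{H}_\lambda$ to be the space of functions $f$ holomorphic in $\mathbb{C}^{n-1}$, for which

$$\int_{\mathbb{C}^{n-1}} |f(z)|^2 e^{-4\pi\lambda|z|^2} \, dm(z) = \|f\|_{\mathcal{H}_\lambda}^2 < \infty.$$

Show that:

(a) $\mathcal{H}_\lambda$ is trivial if $\lambda \le 0$.

(b) $\mathcal{H}_\lambda$ is complete in the indicated norm, so $\mathcal{H}_\lambda$ is a Hilbert space.

(c) Define $P_\lambda(f)(z) = \int_{\mathbb{C}^{n-1}} f(w) K_\lambda(z, w) e^{-4\pi\lambda|w|^2} \, dm(w)$, where $K_\lambda(z, w) = (4\lambda)^{n-1} e^{4\pi\lambda z \cdot \bar{w}}$.

Then $P_\lambda$ is the orthogonal projection of $L^2(e^{-4\pi\lambda|w|^2} \, dm(w))$ to $\mathcal{H}_\lambda$.

[Hint: Show that convergence in the norm $\mathcal{H}_\lambda$ implies uniform convergence on compact subsets of $\mathbb{C}^{n-1}$, using Lemma 8.2.]

19. Prove:

(a) The space $\mathcal{H}$ in Section 8.1 is complete, and hence is a Hilbert space.

(b) Show that the Cauchy integral $f \mapsto C(f)$ gives the orthogonal projection from $L^2(\partial\mathcal{U}, d\beta)$ to the linear space of functions $F_0$ that arise as $\lim_{\epsilon \to 0} F_\epsilon$, for $F \in H^2(\mathcal{U})$.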


[Hint: For (a), use the previous exercise. 
10 Problems


The problems below are not intended as exercises for the reader but are 
meant instead as a guide to further results in the subject. Sources in 
the literature for each of the problems can be found in the “Notes and 
References” section. 

1.* Suppose $f = f(z_1, \dots, z_n)$ is defined in a region $\Omega \subset \mathbb{C}^n$, and for each $j$, $1 \le j \le n$, the function $f$ is holomorphic in $z_j$ with the other variables fixed. Then $f$ is holomorphic in $\Omega$. This was shown at the start of the chapter when $f$ is continuous, and the point of this problem is that no condition on $f$ is required besides the analyticity in each separate variable.

An important ingredient in the proof of this result is an application of the Baire 
category theorem. 


2.* Assume $f$ is holomorphic in a neighborhood of the origin and $f(0) = 0$. Let $f(z) = \sum a_\alpha z^\alpha$ be the power series expansion of $f$ valid near the origin. The order of the zero (at $z = 0$) is the integer $k$ that is the smallest $|\alpha|$ for which $a_\alpha \ne 0$. Then, after a linear change of variables, we can write $f(z) = c(z) P(z)$ near the origin, where $P(z) = z_n^k + a_{k-1}(z') z_n^{k-1} + \cdots + a_0(z')$ with $z = (z', z_n)$, and $c(0) \ne 0$ while $a_{k-1}(0) = \cdots = a_0(0) = 0$. This result is the Weierstrass preparation theorem.

[Hint: Assume that our coordinate system {z , z n ) £ C n_1 x C is such that /( 0 , z n ) = 
Then by Rouche’s theorem we can choose €, r > 0, so that Zk —^ f(z ’， Zk) has k 
zeroes inside the disc \zk\ < r, but is non-vanishing on the boundary，for all |z | <C c. 
Let 71 ( 2 /), 72 ( 2 /) ， … ， 1 k{z) be an arbitrary ordering of these zeroes. Then the 
symmetric functions (J\{z) = ^£=1 「 2 ( 2 /) = 7 ^( 之 ’) 7 爪 ( 之 ’), ..• ， are 

holomorphic in z’，for \z\ < e. This follows since the sums Sm{z) : 

1 < m < k, have this property because they are given by the formula 




2 丌 i 






Now we need only take ak-j{z , ) = (—iy crj(z , ) } and the result holds for P(z) 
Zn + a k -i(z , )z k ~ 1 + • • • + a 0 (z).] 




3.* The original proof of Theorem 4.1 represented $F$ in terms of $F_0$ by Green’s theorem via the “Bochner-Martinelli integral.” The result then held for $F_0$ merely of class $C^1$.


4.* We are concerned with the problem

$$\bar{\partial} u = f, \quad \text{on } \Omega \tag{42}$$

where $\Omega$ is a bounded region in $\mathbb{C}^n$ with $C^\infty$ boundary and $f$ is given in $\Omega$ with $\bar{\partial} f = 0$ there.

(a) If $\Omega$ is pseudo-convex, and $f \in C^\infty(\overline{\Omega})$ then there is $u \in C^\infty(\overline{\Omega})$ that solves (42).

(b) The “normal” solution (if it exists) is defined as the (unique) solution $u$ in $L^2(\Omega)$ for which

$$\int_\Omega u \overline{F} \, dm(z) = 0$$

for all $F$ that are holomorphic in $\Omega$ and are in $L^2$ there. For $\Omega$ that are strongly pseudo-convex (and many other classes of $\Omega$), whenever $f \in C^\infty(\overline{\Omega})$, the normal solution $u$ also belongs to $C^\infty(\overline{\Omega})$. This results from the study of the “$\bar{\partial}$-Neumann problem.”


5.* Domains of holomorphy. A domain of holomorphy is a region $\Omega$ with the property that there exists a holomorphic function $F$ so that for every $z^0 \in \partial\Omega$, the function $F$ cannot be continued into some ball centered at $z^0$. If $\Omega$ is a domain of holomorphy and has boundary of class $C^2$, then $\Omega$ is pseudo-convex, by Theorem 7.5. Conversely, it can be shown that if $\Omega$ is pseudo-convex, then it is a domain of holomorphy.

6.* The converse to Theorem 8.7 holds. If $f$ is a distribution with compact support so that $C(f)(z)$ is analytic near $z = 0$ then $\bar{L}(U) = f$ is locally solvable near the origin.

This is proved by finding a kernel $K$ so that the convolution operator $T(f) = f * K$ on the Heisenberg group is a relative inverse to $\bar{L}$ in the sense that $\bar{L} T(f) = f - C(f)$. Then write $f = f - C(f) + C(f) = f_1 + f_2$, with $f_1 = f - C(f)$ and $f_2 = C(f)$. We can solve $\bar{L}(U_1) = f_1$ by what has just been asserted, and we can solve $\bar{L}(U_2) = f_2$ locally by the Cauchy-Kowalewski theorem, since $f_2$ is real-analytic at the origin.



Oscillatory Integrals in 
Fourier Analysis 


The origin of my devotion to these problems is after I
attended in 1839 Nichol’s Senior Natural Philosophy
class, I had become filled with the utmost admiration
for the splendor and poetry of Fourier... I asked
Nichol if he thought I could read Fourier. He replied
‘perhaps.’ He thought the book a work of most
transcendent merit. So on the 1st of May... I took Fourier
out of the University Library; and in a fortnight I had
mastered it - gone right through it.

W. Thomson (Kelvin), 1840


This result might also have been obtained from the
integral $U$ in its original shape, namely,
$\int_0^\infty \cos(x^3 - nx) \, dx$ ... If $x_1$ be the positive value of
$x$ which renders $x^3 - nx$ a minimum, we have $x_1 = 3^{-1/2} n^{1/2}$.
Let the integral $U$ be divided into three parts,
by integrating separately from $x = 0$ to $x = x_1 - a$,
from $x = x_1 - a$ to $x = x_1 + b$, and from $x = x_1 + b$
to $x = \infty$; then make $n$ infinite...

G. G. Stokes, 1850 


The study of oscillatory integrals and their asymptotics has been a 
vital part of harmonic analysis from the beginnings of the subject. The 
Fourier transform and the attendant Bessel functions provided initial 
examples of such oscillatory integrals. One should also note the study of 
asymptotics in the early works of Airy, Lipschitz, Stokes, and Riemann. 
In the work of the last two, the principle of stationary phase appears, 
if only implicitly; for Stokes it was in a reexamination of Airy’s integral 
and for Riemann it was in the calculation of certain Fourier series. This 
principle was then used more generally by Kelvin in an 1887 paper on 
water waves. The application of these ideas to number theory and lattice 
point problems was initiated in the first quarter of the next century by 
Voronoi and van der Corput, among others.

Given this long history it is an interesting fact that only relatively
recently (1967) did one realize the possibility of restriction theorems for the
Fourier transform, and the relation of the above mentioned asymptotics
to differentiation theory and maximal functions had to wait another ten
years to come to light.

Here we present an introduction to the development of some of these
ideas. Of importance to us is the bearing of certain geometric considerations
(involving curvature) on the decay of the Fourier transform and
these are explained by the behavior of oscillatory integrals.

Two pillars of the theory are: averaging operators, and restriction 
theorems for the Fourier transform. Once we have described some basic 
facts about these, we apply the results of the restriction theorems to 
partial differential equations of “dispersion” type. We also reexamine 
the Radon transform, emphasizing its common traits with the averaging 
operator. Finally, we turn to the problem of counting lattice points and 
see what the ideas of oscillatory integrals teach us. 


1 An illustration 

We begin with a simple example that hints at the role of curvature in harmonic analysis. The setting is $\mathbb{R}^d$ with $d = 3$, and we consider the averaging operator $A$ that gives for each function $f$ its average over the sphere of radius 1 centered at $x$. It can be written as

$$A(f)(x) = \frac{1}{4\pi} \int_{S^2} f(x - y) \, d\sigma(y),$$

with $d\sigma$ the induced Lebesgue measure on the sphere $S^2 = \{x \in \mathbb{R}^3 : |x| = 1\}$. (See Book III, Chapter 6 for the definition and properties of $d\sigma$.)

The unexpected fact about the operator $A$ is that it smooths $f$ in several senses, the simplest one being that when $f \in L^2(\mathbb{R}^3)$, then $A(f)$ will have first derivatives also in $L^2$. This is expressed in the inequality

$$\left\| \frac{\partial}{\partial x_j} A(f) \right\|_{L^2} \le c \|f\|_{L^2}, \quad j = 1, 2, 3. \tag{1}$$

More precisely, this estimate states that for $f \in L^2$, the convolution $\frac{1}{4\pi}(f * d\sigma)$, which is itself an $L^2$ function (see for instance Exercise 17 in Chapter 1), has first derivatives taken in the sense of distributions that are $L^2$ functions and that satisfy (1).
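Now these assertions are a direct consequence of a corresponding estimate for the Fourier transform $\widehat{d\sigma}$ of the measure $d\sigma$, namely

$$\widehat{d\sigma}(\xi) = \int_{S^2} e^{-2\pi i x \cdot \xi} \, d\sigma(x).$$

In the present case one knows $\widehat{d\sigma}$ explicitly:

$$\widehat{d\sigma}(\xi) = \frac{2 \sin(2\pi|\xi|)}{|\xi|},$$

from which it is evident$^1$ that

$$|\widehat{d\sigma}(\xi)| \le c(1 + |\xi|)^{-1}. \tag{2}$$

Now simple manipulations of distributions and their Fourier transforms (see Section 1.5 in Chapter 3) show that $(f * d\sigma)^\wedge = \hat{f} \, \widehat{d\sigma}$, and

$$\left( \frac{\partial}{\partial x_j} A(f) \right)^\wedge (\xi) = \frac{1}{4\pi} \, 2\pi i \xi_j \, \hat{f}(\xi) \, \widehat{d\sigma}(\xi),$$

so (1) follows from (2) and Plancherel’s theorem.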
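
The explicit formula for $\widehat{d\sigma}$ when $d = 3$ can be compared with a direct numerical integration over the sphere. The sketch below is only an illustration: the function names and grid sizes are hypothetical, and it parametrizes $S^2$ by the height $t \in (-1, 1)$ and the angle $\varphi$, for which the surface measure is $dt \, d\varphi$.

```python
import cmath
import math


def sphere_transform(xi, n_t=400, n_phi=100):
    """Midpoint approximation of the integral of exp(-2 pi i x . xi) over S^2."""
    dt = 2.0 / n_t
    dphi = 2.0 * math.pi / n_phi
    total = 0j
    for a in range(n_t):
        t = -1.0 + (a + 0.5) * dt
        s = math.sqrt(1.0 - t * t)
        for b in range(n_phi):
            phi = (b + 0.5) * dphi
            x = (s * math.cos(phi), s * math.sin(phi), t)
            dot = sum(xc * xic for xc, xic in zip(x, xi))
            total += cmath.exp(-2j * math.pi * dot)
    return total * dt * dphi


def closed_form(xi):
    """The stated formula 2 sin(2 pi |xi|) / |xi|."""
    norm = math.sqrt(sum(c * c for c in xi))
    return 2.0 * math.sin(2.0 * math.pi * norm) / norm
```

The two values should agree up to the quadrature error, and the closed form makes the bound $|\widehat{d\sigma}(\xi)| \le 2/|\xi|$ visible directly.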

The results above have extensions to $d$ dimensions for all $d > 1$. We define the averaging operator $A$ in $\mathbb{R}^d$ by

$$A(f)(x) = \frac{1}{\sigma(S^{d-1})} \int_{S^{d-1}} f(x - y) \, d\sigma(y),$$

with $d\sigma$ the induced measure on the unit sphere $S^{d-1}$. We also recall the Sobolev space $L^2_k$ described in Section 3.1 of Chapter 1.

Proposition 1.1 The mapping $f \mapsto A(f)$ is bounded from $L^2(\mathbb{R}^d)$ to $L^2_k(\mathbb{R}^d)$, with $k = \frac{d-1}{2}$.

Note that if $d$ is odd (and hence $k$ is integral), this means

$$\sum_{|\alpha| \le k} \|\partial_x^\alpha A(f)\|_{L^2} \le c \|f\|_{L^2}.$$

The proof of that proposition relies on properties of Bessel functions which we do not prove here. However, these may be found in Book I, Chapter 6, Problem 2, and Book II, Appendix A. In any case, we will see below that these results can be deduced without the use of the theory of Bessel functions.

$^1$ This formula follows by integrating over $S^2$, using polar coordinates; see Chapter 6 in Book III.

Proof. The proposition is a consequence of the identity

$$\widehat{d\sigma}(\xi) = 2\pi |\xi|^{-d/2+1} J_{d/2-1}(2\pi|\xi|), \tag{3}$$

where $\widehat{d\sigma}(\xi) = \int_{S^{d-1}} e^{-2\pi i x \cdot \xi} \, d\sigma(x)$, and $J_m$ is the Bessel function of order $m$. In turn this is just another version of the formula for the Fourier transform of a radial function $f(x) = f_0(|x|)$, given by $\hat{f}(\xi) = F(|\xi|)$, with

$$F(\rho) = 2\pi \rho^{-d/2+1} \int_0^\infty J_{d/2-1}(2\pi\rho r) f_0(r) r^{d/2} \, dr, \tag{4}$$

from which (3) follows by a simple limiting argument. From (3) we obtain the key decay estimate

$$|\widehat{d\sigma}(\xi)| = O(|\xi|^{-\frac{d-1}{2}}) \quad \text{as } |\xi| \to \infty. \tag{5}$$

Indeed, (5) is deducible from (3) and the asymptotic behavior of the Bessel functions that guarantees that $J_m(r) = O(r^{-1/2})$ as $r \to \infty$.

Once (5) is established the proof of the proposition is finished via Plancherel’s theorem as in the case $d = 3$.

The following comments may help put the result in perspective. 

• It is natural to ask if it is some special feature of the sphere among 
hypersurfaces (for instance, its symmetry with respect to rotations) 
that guarantee the crucial decay estimate (5), or does that phe¬ 
nomenon hold in more general circumstances for hypersurfaces M? 
We will see below that the analog of (5) is true when an appropriate 
“curvature” of M is non-vanishing. 

• Moreover, simple examples show that anything like (5) fails com¬ 
pletely when M is “flat” (Exercise 2), and more generally, whatever 

— . 

decay one might hope for da(^) is linked to the degree to which the 
curvature of M does not vanish. 

• One can also observe that the degree of smoothing $k = (d-1)/2$
asserted in Proposition 1.1 can only happen in the context of $L^2$,
and not for $L^p$, $p \ne 2$. (A result in this direction is outlined in
Exercise 7.)
• Finally, it is interesting to remark that when d = 3 the averaging
operator furnishes the solution of the wave equation
$\Delta_x u(x,t) = \frac{\partial^2 u}{\partial t^2}(x,t)$, $(x,t) \in \mathbb{R}^3 \times \mathbb{R}$, with $u(x,0) = 0$ and $\frac{\partial u}{\partial t}(x,0) = f(x)$.
The solution for time t = 1 is given by $u(x,1) = A(f)(x)$, and for
other times it can be obtained by rescaling. (See Chapter 6 in
Book I, where A is denoted by M.)


2 Oscillatory integrals 

Certain basic facts about oscillatory integrals will allow us to generalize 
the decay estimate (5) we have obtained for the sphere. What we have 
in mind are the integrals of the form

(6)    $I(\lambda) = \int_{\mathbb{R}^d} e^{i\lambda\Phi(x)} \psi(x)\, dx,$

and the question of their behavior for large $\lambda$.

The function $\Phi$ is called the phase and $\psi$ the amplitude. In what
follows we assume that both the phase $\Phi$ and the parameter $\lambda$ are
real-valued, but $\psi$ may be allowed to be complex-valued.$^2$

There is a basic principle underlying the analysis, that of stationary
phase: in so far as the derivative (or gradient) of the phase is
non-vanishing, the integral is rapidly decreasing in $\lambda$ (and thus negligible);
thus the main contribution of (6) comes from those points x where the
gradient of $\Phi$ vanishes; so when d = 1 these are the x for which $\Phi'(x) = 0$.

The first observation along these lines is merely an extension of a
simple estimate for the Fourier transform (effectively the case
$\Phi(x) = 2\pi \frac{\xi}{|\xi|}\cdot x$, and $\lambda = |\xi|$). We assume here that $\Phi$ and $\psi$ are $C^\infty$ functions,
and that $\psi$ has compact support.


Proposition 2.1 Suppose $|\nabla\Phi(x)| \ge c > 0$ for all x in the support of $\psi$.
Then for every $N > 0$

    $|I(\lambda)| \le c_N \lambda^{-N}$, whenever $\lambda > 0$.


Proof. We consider the following vector field

    $L = \frac{1}{i\lambda} \sum_{k=1}^{d} a_k \frac{\partial}{\partial x_k} = \frac{1}{i\lambda}(a\cdot\nabla),$


2 However in some circumstances it is of interest to allow $\Phi$ or $\lambda$ to be complex valued.
This arises in particular when d = 1 and $\Phi$ (and $\psi$) are analytic and the integral (6) is
treated by deforming contours of integration, as in Appendix A of Book II.
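with $a = (a_1, \ldots, a_d) = \frac{\nabla\Phi}{|\nabla\Phi|^2}$. Then the transpose $L^t$ of L is given by

    $L^t(f) = -\frac{1}{i\lambda} \sum_{k=1}^{d} \frac{\partial}{\partial x_k}(a_k f) = -\frac{1}{i\lambda} \nabla\cdot(af).$

Because of our assumption on $\nabla\Phi$, the $a_j$ and all their partial derivatives
are each bounded on the support of $\psi$.

Now observe that $L(e^{i\lambda\Phi}) = e^{i\lambda\Phi}$, therefore $L^N(e^{i\lambda\Phi}) = e^{i\lambda\Phi}$ for every
positive integer N. Thus

    $I(\lambda) = \int_{\mathbb{R}^d} L^N(e^{i\lambda\Phi})\, \psi\, dx = \int_{\mathbb{R}^d} e^{i\lambda\Phi} (L^t)^N(\psi)\, dx.$

Taking absolute values in the last integral gives $|I(\lambda)| \le c_N \lambda^{-N}$ for
positive $\lambda$, thus proving the proposition.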

The next two assertions are limited to dimension one, where we can obtain
more precise conclusions with simpler hypotheses. In this situation
it is appropriate to consider first the integral $I_1$ given by

(7)    $I_1(\lambda) = \int_a^b e^{i\lambda\Phi(x)}\, dx,$

where a and b are any real numbers. Thus in (7) there is no amplitude
$\psi$ present (or put another way, $\psi(x) = \chi_{(a,b)}(x)$). Here we assume only
that $\Phi$ is of class $C^2$, and $\Phi'(x)$ is monotonic (increasing or decreasing),
while $|\Phi'(x)| \ge 1$ in the interval [a, b].

Proposition 2.2 In the above situation, $|I_1(\lambda)| \le c\lambda^{-1}$, all $\lambda > 0$, with
c = 3.


What is important here is not the specific value of c, but that it is
independent of the length of the interval [a, b]. Note that the order of
decrease in $\lambda$ cannot be improved, as the simple example $\Phi(x) = x$, and
$I_1(\lambda) = \frac{1}{i\lambda}(e^{i\lambda b} - e^{i\lambda a})$ shows.

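Proof. The proof uses the operator L that occurred in the previous
proposition. We may assume $\Phi' > 0$ on [a, b], because the case when
$\Phi' < 0$ follows by taking complex conjugates. So $L = \frac{1}{i\lambda\Phi'(x)} \frac{d}{dx}$ and
$L^t(f) = -\frac{1}{i\lambda} \frac{d}{dx}(f/\Phi')$, hence

    $I_1(\lambda) = \int_a^b L(e^{i\lambda\Phi})\, dx = \int_a^b e^{i\lambda\Phi} L^t(1)\, dx + \left[ e^{i\lambda\Phi} \frac{1}{i\lambda\Phi'} \right]_a^b,$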
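and now (because we do not have an amplitude $\psi$ that vanishes at the
end-points) there are boundary terms. Since $|\Phi'(x)| \ge 1$, these two terms
contribute a total majorized by $2/\lambda$. But the integral on the right-hand
side is clearly bounded by

    $\int_a^b |L^t(1)|\, dx = \frac{1}{\lambda} \int_a^b \left| \frac{d}{dx}\left(\frac{1}{\Phi'}\right) \right| dx.$

However $\Phi'$ is monotonic and continuous while $|\Phi'(x)| \ge 1$, so $\frac{d}{dx}(1/\Phi')$
does not change sign in the interval [a, b]. Therefore

    $\frac{1}{\lambda} \int_a^b \left| \frac{d}{dx}\left(\frac{1}{\Phi'}\right) \right| dx = \frac{1}{\lambda} \left| \int_a^b \frac{d}{dx}\left(\frac{1}{\Phi'}\right) dx \right| = \frac{1}{\lambda} \left| \frac{1}{\Phi'(b)} - \frac{1}{\Phi'(a)} \right|,$

and the last quantity is at most $1/\lambda$, because $1/\Phi'(a)$ and $1/\Phi'(b)$ both lie in (0, 1].

Altogether then $|I_1(\lambda)| \le 3/\lambda$ and the proposition is proved.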


Remark. If in the above proposition we assumed that $|\Phi'(x)| \ge \mu$ (instead
of $|\Phi'(x)| \ge 1$), then we could get $|I_1(\lambda)| \le c(\lambda\mu)^{-1}$. This is
obvious on replacing $\Phi$ by $\Phi/\mu$, and $\lambda$ by $\lambda\mu$ in the proposition.

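Next we ask what happens to $I_1(\lambda)$ when $\Phi'(x_0) = 0$ for some $x_0$, if
we make the assumption that the critical point $x_0$ is non-degenerate
in the sense that $\Phi''(x_0) \ne 0$. A good indication of what we may expect
comes from the case $\Phi(x) = x^2$ (where the critical point is the origin).
Here one has

    $\int_{-\infty}^{\infty} e^{i\lambda x^2} \psi(x)\, dx = c_0 \lambda^{-1/2} + O(|\lambda|^{-3/2})$, as $\lambda \to \infty$,

and more generally

(8)    $\int_{-\infty}^{\infty} e^{i\lambda x^2} \psi(x)\, dx = \sum_{k=0}^{N} c_k \lambda^{-1/2-k} + O(|\lambda|^{-3/2-N}),$

for every $N \ge 0$. To see (8) we start with the formula for the Fourier
transform of the Gaussian that states

    $\int_{\mathbb{R}} e^{-\pi s x^2} \psi(x)\, dx = s^{-1/2} \int_{\mathbb{R}} e^{-\pi\xi^2/s}\, \hat{\psi}(\xi)\, d\xi, \quad s > 0.$

Now since both sides have analytic continuations for Re(s) > 0, the
passing to the limit, $s = -i\lambda/\pi$, yields

    $\int_{\mathbb{R}} e^{i\lambda x^2} \psi(x)\, dx = \left(\frac{i\pi}{\lambda}\right)^{1/2} \int_{\mathbb{R}} e^{-i\pi^2\xi^2/\lambda}\, \hat{\psi}(\xi)\, d\xi.$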
So the expansion $e^{iu^2} = \sum_{k=0}^{N} \frac{(iu^2)^k}{k!} + O(|u|^{2N+2})$ gives us (8) with
$c_k = (i\pi)^{1/2} \frac{i^k}{2^{2k} k!} \psi^{(2k)}(0)$. This indicates that a decrease of order $O(\lambda^{-1/2})$ can
be expected when the phase has a critical point which is non-degenerate.
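The expansion (8) can be checked in closed form for one particular amplitude. The Python sketch below takes $\psi(x) = e^{-x^2}$, which is an assumption made purely for illustration: it is rapidly decreasing rather than compactly supported, but the integral is then exactly $\sqrt{\pi/(1 - i\lambda)}$. It uses the coefficients $c_k = (i\pi)^{1/2}\, i^k\, \psi^{(2k)}(0)/(2^{2k} k!)$ together with $\psi^{(2k)}(0) = (-1)^k (2k)!/k!$; the function names are illustrative.

```python
import cmath
import math

def exact_integral(lam):
    """Integral over the real line of exp(i*lam*x^2) * exp(-x^2) dx (principal square root)."""
    return cmath.sqrt(math.pi / (1 - 1j * lam))

def expansion(lam, terms):
    """Partial sum c_0 lam^(-1/2) + ... + c_{terms-1} lam^(-1/2-(terms-1)) of (8) for psi(x) = exp(-x^2)."""
    total = 0j
    for k in range(terms):
        # Even derivatives of exp(-x^2) at the origin.
        deriv = (-1) ** k * math.factorial(2 * k) / math.factorial(k)
        c_k = cmath.sqrt(1j * math.pi) * (1j ** k) / (4 ** k * math.factorial(k)) * deriv
        total += c_k * lam ** (-0.5 - k)
    return total

for lam in (10.0, 100.0, 1000.0):
    err0 = abs(exact_integral(lam) - expansion(lam, 1))
    err1 = abs(exact_integral(lam) - expansion(lam, 2))
    # If (8) holds, both rescaled errors should stay bounded as lam grows.
    print(lam, err0 * lam ** 1.5, err1 * lam ** 2.5)
```

The first rescaled error approaches $|c_1| = \sqrt{\pi}/2$, which is what (8) with N = 0 predicts.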

There is a version of Proposition 2.2 for the second derivative that
takes this observation into account: it is the following estimate of van
der Corput. Here $\Phi$ is again supposed to be of class $C^2$ in the interval
[a, b], but now we assume that $|\Phi''(x)| \ge 1$ throughout the interval.

Proposition 2.3 Under the above assumptions, and with $I_1(\lambda)$ given
by (7) we have

(9)    $|I_1(\lambda)| \le c'\lambda^{-1/2}$ for all $\lambda > 0$, with $c' = 8$.

Again, it is not the exact value of $c'$ that matters, but that it is
independent of [a, b].

Proof. We may assume that $\Phi''(x) \ge 1$ throughout the interval,
because the case $\Phi''(x) \le -1$ follows from this by taking complex
conjugates. Now $\Phi''(x) \ge 1$ implies that $\Phi'(x)$ is strictly increasing, so if $\Phi$ has
a critical point in [a, b], it can have only one. Assume $x_0$ is such a critical
point and break the interval [a, b] into three sub-intervals: the first is
centered at $x_0$ and is $[x_0 - \delta, x_0 + \delta]$ with $\delta$ chosen momentarily. The other
two make up the complement and are $[a, x_0 - \delta]$ and $[x_0 + \delta, b]$. Now the
first interval has length $2\delta$, so trivially the integral taken over that
interval contributes at most $2\delta$. On the interval $[x_0 + \delta, b]$ we observe that
$\Phi'(x) \ge \delta$ (because $\Phi'' \ge 1$) and so by Proposition 2.2 and the remark
that follows it, the integral contributes at most $3/(\delta\lambda)$; similarly for the
interval $[a, x_0 - \delta]$. Thus altogether $I_1(\lambda)$ is majorized by $2\delta + 6/(\delta\lambda)$,
and upon choosing $\delta = \lambda^{-1/2}$ we get (9). Note that if $\Phi$ has no critical
points in [a, b] and/or one of the three intervals is smaller than indicated,
then each of the estimates holds a fortiori, and hence also the conclusion.
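The bound (9) can be illustrated numerically. The following Python sketch approximates $I_1(\lambda)$ for the sample phase $\Phi(x) = x^2/2$ (so that $\Phi'' = 1$) on the interval [-1, 2], using a composite Simpson rule. The phase, the interval, the values of $\lambda$ and the helper names are all choices made for this illustration, not part of the argument above.

```python
import cmath

def simpson(f, a, b, n):
    """Composite Simpson rule with n subintervals (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

def I1(lam, a, b, n=20000):
    """Approximate the integral over [a, b] of exp(i*lam*Phi(x)) for Phi(x) = x^2/2."""
    return simpson(lambda x: cmath.exp(1j * lam * x * x / 2), a, b, n)

def scaled_values(a=-1.0, b=2.0, lams=(1.0, 10.0, 100.0, 400.0)):
    """Return lam^(1/2) * |I_1(lam)| for each lam; Proposition 2.3 predicts values at most 8."""
    return [lam ** 0.5 * abs(I1(lam, a, b)) for lam in lams]

print(scaled_values())
```

For large $\lambda$ the rescaled values settle near $\sqrt{2\pi}$, the contribution of the critical point at the origin, which is comfortably below the constant 8.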

There is a similar conclusion when an amplitude $\psi$ is present. We
suppose $\psi$ is of class $C^1$ in the interval [a, b].


Corollary 2.4 Assume $\Phi$ satisfies the hypotheses of Proposition 2.3.
Then

(10)    $\left| \int_a^b e^{i\lambda\Phi(x)} \psi(x)\, dx \right| \le c_\psi \lambda^{-1/2},$

where $c_\psi = 8\left( \int_a^b |\psi'(x)|\, dx + |\psi(b)| \right)$.
Proof. Let $J(x) = \int_a^x e^{i\lambda\Phi(u)}\, du$. We integrate by parts, using $J(a) = 0$.
Then

    $\int_a^b e^{i\lambda\Phi(x)} \psi(x)\, dx = -\int_a^b J(x) \frac{d\psi}{dx}\, dx + J(b)\psi(b),$

and the result follows, because $|J(x)| \le 8\lambda^{-1/2}$ for each x, by the
proposition.

As an illustration, we give a quick proof of the Bessel function estimate

(11)    $J_m(r) = O(r^{-1/2})$ as $r \to \infty$

when m is a fixed integer. We have (see, for instance, Section 4 in
Chapter 6, Book I) that

    $J_m(r) = \frac{1}{2\pi} \int_0^{2\pi} e^{ir\sin x} e^{-imx}\, dx.$

Here $\lambda = r$, $\Phi(x) = \sin x$, and $\psi(x) = \frac{1}{2\pi} e^{-imx}$. Now break the interval
$[0, 2\pi]$ into two parts, according to whether $|\sin x| \ge 1/\sqrt{2}$ or $|\cos x| \ge 1/\sqrt{2}$.
The first part consists of two sub-intervals to which we may
apply the corollary, giving a contribution of $O(r^{-1/2})$. The second part
is the sum of three sub-intervals to which one can apply a version of
Proposition 2.2 (analogous to the corollary), and this gives a contribution
of $O(r^{-1}) = O(r^{-1/2})$, as $r \to \infty$.
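The estimate (11) can be observed numerically from the same integral representation. The Python sketch below evaluates $J_m(r)$ with an equally spaced quadrature rule, which is very accurate here because the integrand is smooth and periodic, and reports the largest sampled value of $r^{1/2}|J_m(r)|$. The sample radii, the number of nodes and the function names are assumptions of this sketch.

```python
import cmath
import math

def bessel_J(m, r, n=1024):
    """Approximate J_m(r) = (1/(2*pi)) * integral over [0, 2*pi] of exp(i*r*sin x) * exp(-i*m*x) dx."""
    # Equally spaced nodes over one full period; n should comfortably exceed r + m.
    h = 2 * math.pi / n
    total = sum(cmath.exp(1j * (r * math.sin(j * h) - m * j * h)) for j in range(n))
    # The imaginary part cancels by the symmetry x -> 2*pi - x, so only the real part is kept.
    return (total / n).real

def scaled_sup(m, radii):
    """Largest value of sqrt(r) * |J_m(r)| over the sample radii."""
    return max(math.sqrt(r) * abs(bessel_J(m, r)) for r in radii)

radii = [float(j) for j in range(1, 201)]  # r = 1, 2, ..., 200
print(scaled_sup(0, radii), scaled_sup(1, radii))
```

Both printed values stay below 1 over the sampled range, consistent with $J_m(r) = O(r^{-1/2})$.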


In dimension d greater than 1, the fact is that there are no analogs of
the strict estimates given by Propositions 2.2 and 2.3. However, there is
a workable version of the second derivative test of Proposition 2.3 that
can be established. We now take this up and then apply it below.

We consider phase and amplitude functions $\Phi$ and $\psi$ that are $C^\infty$ and
we suppose that $\psi$ has compact support. We form the d × d Hessian
matrix of $\Phi$, given by

    $\left\{ \frac{\partial^2\Phi}{\partial x_j\, \partial x_k} \right\}_{1 \le j, k \le d}$

and abbreviated as $\nabla^2\Phi$.

The main assumption will be that

(12)    $\det\{\nabla^2\Phi\} \ne 0$ on the support of $\psi$.

Proposition 2.5 Suppose (12) holds. Then

(13)    $I(\lambda) = \int_{\mathbb{R}^d} e^{i\lambda\Phi(x)} \psi(x)\, dx = O(\lambda^{-d/2})$, as $\lambda \to \infty$.
We estimate $I(\lambda)$ via $|I(\lambda)|^2 = \overline{I(\lambda)}\, I(\lambda)$. This simple trick allows us to
bring in the Hessian of $\Phi$ (that is, second derivatives) in terms of first
derivatives of differences of $\Phi$, an idea that has many variants.

Before we exploit this artifice we must take a precaution: we will
assume that the support of $\psi$ is sufficiently small, in particular, that it
lies in a ball of fixed radius $\epsilon$, where $\epsilon$ will be chosen in terms of $\Phi$. Once
the estimate (13) has been proved for such $\psi$ we can obtain (13) for
general $\psi$ as a finite sum of these estimates, by using a partition of unity
to cover the support of the original $\psi$.

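Now

    $I(\lambda)\overline{I(\lambda)} = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} e^{i\lambda[\Phi(y)-\Phi(x)]}\, \psi(y)\overline{\psi(x)}\, dx\, dy.$

Here we make the change of variables y = x + u (with x fixed), that is,
u = y - x. Then the double integral becomes

    $\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} e^{i\lambda[\Phi(x+u)-\Phi(x)]}\, \psi(x,u)\, dx\, du,$

where $\psi(x,u) = \psi(x+u)\overline{\psi(x)}$ is a $C^\infty$ function of compact support. Notice
that $\psi(x,u)$ is supported where $|u| \le 2\epsilon$, since both x and y are
restricted to range in the same ball of radius $\epsilon$. Therefore we have
$|I(\lambda)|^2 = \int_{\mathbb{R}^d} J_\lambda(u)\, du$, where

    $J_\lambda(u) = \int_{\mathbb{R}^d} e^{i\lambda[\Phi(x+u)-\Phi(x)]}\, \psi(x,u)\, dx.$

We claim that

(14)    $|J_\lambda(u)| \le c_N (\lambda|u|)^{-N}$, for every $N \ge 0$.

This is in the spirit of Proposition 2.1, and the proof of (14) follows the
approach of that proposition.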

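We use the vector field

    $L = \frac{1}{i\lambda}(a\cdot\nabla_x)$

and its transpose $L^t$ given by $L^t(f) = -\frac{1}{i\lambda} \nabla_x\cdot(af)$. Here

    $a = \frac{\nabla_x(\Phi(x+u)-\Phi(x))}{|\nabla_x(\Phi(x+u)-\Phi(x))|^2} = \frac{b}{|b|^2},$

with $b = \nabla_x(\Phi(x+u) - \Phi(x))$.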
We have

(15)    $|b| = |\nabla_x(\Phi(x+u)-\Phi(x))| \approx |u|,$

if |u| is sufficiently small, in particular if $|u| \le 2\epsilon$.$^3$

The upper estimate $|b| \lesssim |u|$ is clear since $\Phi$ is smooth. For the
lower estimate observe that by Taylor’s theorem,
$\nabla_x(\Phi(x+u)-\Phi(x)) = \nabla^2\Phi(x)\cdot u + O(|u|^2)$. However our assumption (12) means that the
linear transformation represented by $\nabla^2\Phi(x)$ is invertible, so $|\nabla^2\Phi(x)\cdot u| \ge c|u|$
for some c > 0. Therefore (15) is established if $\epsilon$ has been taken small
enough. Observe also that $|\partial_x^\alpha b| \le c_\alpha |u|$, for all $\alpha$, and hence, using (15)
we see that

(16)    $|\partial_x^\alpha a| \le c_\alpha |u|^{-1}$, for all $\alpha$,

and as a result $|(L^t)^N(\psi(x,u))| \le c_N (\lambda|u|)^{-N}$ for every positive
integer N.

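However,

    $J_\lambda(u) = \int_{\mathbb{R}^d} L^N\left(e^{i\lambda[\Phi(x+u)-\Phi(x)]}\right) \psi(x,u)\, dx = \int_{\mathbb{R}^d} e^{i\lambda[\Phi(x+u)-\Phi(x)]}\, (L^t)^N(\psi(x,u))\, dx,$

and thus by (16), we have $|J_\lambda(u)| \le c_N (\lambda|u|)^{-N}$, proving (14).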

With this estimate established we take N = 0 and N = d + 1 in (14)
and see that

    $|I(\lambda)|^2 \le \int_{\mathbb{R}^d} |J_\lambda(u)|\, du \le c \int_{\mathbb{R}^d} \frac{du}{(1+\lambda|u|)^{d+1}} = c'\lambda^{-d},$

as is evident by rescaling the last integral. This proves (13) and the
proposition.

For later applications, it is of interest to elaborate some aspects of 
Proposition 2.5. 

(i) The conclusion requires only that $\Phi$ is of class $C^{d+2}$ and $\psi$ of class
$C^{d+1}$. In fact, as the patient reader may verify, in the estimate
$|I(\lambda)| \le A\lambda^{-d/2}$, the bound A depends only on the $C^{d+2}$ norm of $\Phi$, the $C^{d+1}$
norm of $\psi$, the lower bound for $|\det\{\nabla^2\Phi\}|$, and the diameter of the
support of $\psi$.


3 Here we use the notation $X \lesssim Y$ and $X \approx Y$ to denote the fact that $X \le cY$ and
$c^{-1}Y \le X \le cY$ respectively, for appropriate constants c.
Similarly, the bound $c_N$ appearing in Proposition 2.1 depends only on
the $C^{N+1}$ norm of $\Phi$, the $C^N$ norm of $\psi$, a lower bound for $|\nabla\Phi|$, and
the diameter of the support of $\psi$.

(ii) There is a version of Proposition 2.5 in which we assume only that
the rank of the Hessian of $\Phi$ is greater than or equal to m, $0 < m \le d$,
on the support of $\psi$. In that case the conclusion is

(17)    $I(\lambda) = O(\lambda^{-m/2})$, as $\lambda \to \infty$.

This may be deduced from the case m = d, already established. One
proceeds as follows. For each $x^0$, the symmetric matrix $\nabla^2\Phi(x^0)$ can
be diagonalized by introducing (via a rotation) a new coordinate
system $x = (x', x'') \in \mathbb{R}^m \times \mathbb{R}^{d-m}$, so that $\nabla^2\Phi(x^0)$, when restricted to
$\mathbb{R}^m$, has a non-vanishing determinant. Hence for a small open ball B
centered at $x^0$, the same is true for $\nabla^2\Phi(x)$ when $x \in B$. Now for
each fixed $x'' \in \mathbb{R}^{d-m}$ we use the proposition (where d = m) to obtain
$\left| \int_{\mathbb{R}^m} e^{i\lambda\Phi(x',x'')}\, \psi(x',x'')\, dx' \right| \le A\lambda^{-m/2}$, with $\psi$ supported in B. After
integrating in $x''$ and summing over finitely many such balls that
cover the support of $\psi$ we obtain (17).

3 Fourier transform of surface-carried measures 

We will now study surface-carried measures and their Fourier transforms. 
Our goal is a generalization of the estimate (5), which we had seen in 
the case of the sphere. 

Recall from Section 4 of the previous chapter that given a point $x^0$ on
a $C^\infty$ hypersurface$^4$ M we dealt with a new coordinate system centered
at $x^0$ (given via a translation and rotation of the initial coordinates),
written as $x = (x', x_d) \in \mathbb{R}^{d-1} \times \mathbb{R}$, so that in a ball centered at $x^0$, the
surface M is represented as

(18)    $M = \{(x', x_d) \in B : x_d = \varphi(x')\},$

where B is the corresponding ball centered at the origin. We can also
arrange matters so that the function $\varphi$, which is $C^\infty$, satisfies $\varphi(0) = 0$,
and $\nabla_{x'}\varphi(x')|_{x'=0} = 0$.

Now this representation gives a defining function $\rho_1$ of M, with
$\rho_1(x) = \varphi(x') - x_d$. Among the various possible defining functions of M near $x^0$,
we now choose one, $\rho$, which is normalized by the condition $|\nabla\rho| = 1$
on M. This can be achieved by setting $\rho = \rho_1/|\nabla\rho_1|$ near M. With such


4 The thrust of the $C^\infty$ requirement is that M is of class $C^k$ for sufficiently large k;
later we will be more specific about how large k must be taken.
a normalized defining function, the curvature form of M at $x \in M$
(also known as the second fundamental form) is the form

(19)    $\sum_{1 \le k, j \le d} \xi_k \xi_j \frac{\partial^2\rho}{\partial x_k\, \partial x_j}(x),$

restricted to vectors $\xi = (\xi_1, \ldots, \xi_d)$ that are tangent to M at x. The reader
might note here the parallel between the curvature just described in terms
of a quadratic form given by the defining function, and its complex analog
(the Levi form) that was important in the previous chapter.

It is straightforward to verify that this form does not depend on the 
choice of a normalized defining function. 

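Now reverting to (18) and using $\nabla_{x'}\varphi(x')|_{x'=0} = 0$ we see that

    $\varphi(x') = \frac{1}{2} \sum_{1 \le k, j \le d-1} a_{kj}\, x_k x_j + O(|x'|^3),$

and the curvature form is represented by the (d - 1) × (d - 1) matrix
$\left\{ \frac{\partial^2\varphi}{\partial x_k\, \partial x_j}(0) \right\} = \{a_{kj}\}$, $1 \le k, j \le d-1$. Now if we make an appropriate
rotation in the $x' \in \mathbb{R}^{d-1}$ space and relabel the coordinates accordingly
we have

    $\varphi(x') = \frac{1}{2} \sum_{j=1}^{d-1} \lambda_j x_j^2 + O(|x'|^3).$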

The eigenvalues $\lambda_j$ are called the principal curvatures of M (at $x^0$) and
their product (the determinant of the matrix) is the total curvature or
Gauss curvature of M.$^5$

Notice that there is an implicit choice of signs (or “orientation”) that
has been made. The signs of the principal curvatures can be reversed if
we use $-\rho$ instead of $\rho$ as the defining function of M.

We mention briefly several examples. 

Example 1. The unit sphere in $\mathbb{R}^d$. If we start with $\rho_1 = |x|^2 - 1$
as a defining function, then $\rho = \frac{1}{2}\rho_1$ is “normalized.” All the principal
curvatures are equal to 1.

Example 2. The parabolic hyperboloid $\{x_3 = x_1^2 - x_2^2\}$ in $\mathbb{R}^3$. This
hypersurface has non-vanishing principal curvatures of opposite sign at
each point.


5 There is a neat geometric interpretation of the Gauss curvature in terms of the “Gauss
map”; see Problem 1.
Example 3. The circular cone $\{x_d^2 = |x'|^2,\ x_d \ne 0\}$ in $\mathbb{R}^d$. This
hypersurface has d - 2 identical non-vanishing principal curvatures at each
point. The calculations involved are outlined in Exercise 9.

Next we consider the induced Lebesgue measure on M, the measure $d\sigma$
that has the following property: for any continuous function f on M with
compact support

    $\int_M f\, d\sigma = \lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_{d(x,M) < \epsilon} F\, dx.$

Here F is a continuous extension of f into a neighborhood of M and
$\{x : d(x, M) < \epsilon\}$ is the “collar” of points at distance $< \epsilon$ from M. Now,
as is well-known (see also Exercise 8), in our coordinate system
$d\sigma = (1 + |\nabla_{x'}\varphi|^2)^{1/2}\, dx'$, in the sense that

(20)    $\int_M f\, d\sigma = \int_{\mathbb{R}^{d-1}} f(x', \varphi(x'))\,(1 + |\nabla_{x'}\varphi|^2)^{1/2}\, dx'.$

With this we can say that a measure $d\mu$ is a surface-carried measure
on M with smooth density if $d\mu$ is of the form $d\mu = \psi\, d\sigma$, where $\psi$ is
a $C^\infty$ function of compact support.

We now have all the ingredients necessary to state the main result
concerning the Fourier transform of $d\mu$ defined by

    $\widehat{d\mu}(\xi) = \int_M e^{-2\pi i x\cdot\xi}\, d\mu.$

Note that $\widehat{d\mu}(\xi)$ is bounded on $\mathbb{R}^d$ since the measure $d\mu$ is finite.

Theorem 3.1 Suppose the hypersurface M has non-vanishing Gauss
curvature at each point of the support of $d\mu$. Then

(21)    $|\widehat{d\mu}(\xi)| = O(|\xi|^{-(d-1)/2})$ as $|\xi| \to \infty$.

Corollary 3.2 If M has at least m non-vanishing principal curvatures
at each point of the support of $d\mu$, then

    $|\widehat{d\mu}(\xi)| = O(|\xi|^{-m/2})$ as $|\xi| \to \infty$.

First some preliminary remarks. We can assume that the support
of $\psi$ is centered in a sufficiently small ball (so that in particular the
representation (18) of M holds in it), because we can always write a
given $\psi$ as a finite sum of $\psi_j$ of that type. Next, all our estimates can
be made in the coordinate system used in (18) since the transformations
of the x-space used in that change of coordinates involve only a
translation and a rotation. The Fourier transform then undergoes
a multiplication by a factor of absolute value 1 (a character) and the
same rotation in the $\xi$ variable. Thus the estimate (21) is unchanged.
Now because of (20) we have

(22)    $\widehat{d\mu}(\xi) = \int_{\mathbb{R}^{d-1}} e^{-2\pi i(x'\cdot\xi' + \varphi(x')\xi_d)}\, \tilde{\psi}(x')\, dx',$

with $\xi = (\xi', \xi_d) \in \mathbb{R}^{d-1} \times \mathbb{R}$ and $\tilde{\psi}$ the $C^\infty$ function with compact support
given by

    $\tilde{\psi}(x') = \psi(x', \varphi(x'))\,(1 + |\nabla_{x'}\varphi|^2)^{1/2}.$

We divide the $\xi$ space into two parts: the “critical” region, the cone
$|\xi_d| \ge c|\xi'|$, where c can be taken to be any fixed positive constant; and
the subsidiary region, $|\xi_d| < c|\xi'|$, but here we need to assume that in
fact c is small.

In the first region we may suppose that $\xi_d$ is positive, since the case
when $\xi_d$ is negative follows by complex conjugation, or can be done
similarly, and we write the exponent in the Fourier transform as $i\lambda\Phi(x')$
with the choice of $\lambda = 2\pi\xi_d$, and $\Phi(x') = -\varphi(x') - \frac{x'\cdot\xi'}{\xi_d}$. Observe that
$\nabla^2_{x'}\Phi = -\nabla^2_{x'}\varphi$, and hence if the support of $\tilde{\psi}$ is sufficiently small (which
means we are sufficiently close to $x^0$), the determinant of the Hessian of $\Phi$
is non-vanishing. This is because of the corresponding property of $\varphi$ that
represents the non-vanishing of the curvature of M. Note also that $\Phi$
has, for any fixed N, a $C^N$ norm that is uniformly bounded as $\xi$ ranges
over the set $|\xi_d| \ge c|\xi'|$. We can now apply Proposition 2.5 (with $\mathbb{R}^{d-1}$
in place of $\mathbb{R}^d$) and get

    $|\widehat{d\mu}(\xi)| = O(\lambda^{-(d-1)/2}) = O(\xi_d^{-(d-1)/2}) = O(|\xi|^{-(d-1)/2}),$

since here $|\xi_d| \ge c'|\xi|$.

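In the complementary region $|\xi_d| < c|\xi'|$ we write $\lambda = 2\pi|\xi'|$, and
$\Phi(x') = -\varphi(x')\frac{\xi_d}{|\xi'|} - \frac{x'\cdot\xi'}{|\xi'|}$. Note that $\left| \nabla_{x'}\left( \frac{x'\cdot\xi'}{|\xi'|} \right) \right| = 1$, while $\frac{|\xi_d|}{|\xi'|} |\nabla_{x'}\varphi| \le 1/2$
if c is so small that $c|\nabla_{x'}\varphi| \le 1/2$ throughout the support of $\tilde{\psi}$. So if
we invoke Proposition 2.1, the fact that $|\nabla_{x'}\Phi| \ge 1/2$ yields for each
positive N,

    $|\widehat{d\mu}(\xi)| = O(\lambda^{-N}) = O(|\xi'|^{-N}) = O(|\xi|^{-N})$

when $\xi$ is in the second region. Taking $N \ge \frac{d-1}{2}$ completes the proof of
the theorem.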

The corollary can be proved by the same argument if one uses the 
estimate (17) instead of (13). 

Suppose that $\Omega$ is a bounded region whose boundary $M = \partial\Omega$ satisfies
the hypothesis of Theorem 3.1. If $\chi_\Omega$ is the characteristic function of $\Omega$,
then its Fourier transform has a decay that is one order better than that
of the corresponding surface-carried measure on its boundary.

Corollary 3.3 If $M = \partial\Omega$ has non-vanishing Gauss curvature at each
point, then

    $\hat{\chi}_\Omega(\xi) = O(|\xi|^{-(d+1)/2})$ as $|\xi| \to \infty$.

Proof. Using an appropriate partition of unity we can write

    $\chi_\Omega = \sum_{j=0}^{N} \psi_j \chi_\Omega$

with each $\psi_j$ a $C^\infty$ function of compact support; $\psi_0$ is supported in
the interior of $\Omega$, while each $\psi_j$, $1 \le j \le N$, is supported in a small
neighborhood of the boundary in which the boundary is given as (18).
Now since $\psi_0\chi_\Omega = \psi_0$, it is clear that $(\psi_0\chi_\Omega)^\wedge$ is rapidly decreasing.

Next consider any $(\psi_j\chi_\Omega)^\wedge$ for $1 \le j \le N$. In analogy with (22), this
has the form

    $\int_{x_d > \varphi(x')} e^{-2\pi i(x'\cdot\xi' + x_d\xi_d)}\, \psi_j(x', x_d)\, dx'\, dx_d,$

which is, after changing variables so that $x_d = u + \varphi(x')$,

(23)    $\int_{\mathbb{R}^{d-1}} e^{-2\pi i(x'\cdot\xi' + \varphi(x')\xi_d)}\, \Psi(x', \xi_d)\, dx',$

where $\Psi(x', \xi_d) = \int_0^\infty e^{-2\pi i u\xi_d}\, \psi_j(x', u + \varphi(x'))\, du$. Note that $\Psi(x', \xi_d)$
is a $C^\infty$ function in $x'$ of compact support, uniformly in $\xi_d$. When
$|\xi_d| < c|\xi'|$, the argument proceeds as before, giving an estimate $O(|\xi|^{-N})$
for each $N \ge 0$. To deal with the situation when $|\xi_d| \ge c|\xi'|$ write

    $\Psi(x', \xi_d) = -\frac{1}{2\pi i\xi_d} \int_0^\infty \frac{d}{du}\left(e^{-2\pi i u\xi_d}\right) \psi_j(x', u + \varphi(x'))\, du,$

and integrate by parts, giving us an additional decay of $O(1/|\xi_d|) = O(1/|\xi|)$
in (23). This proves the corollary.
Remark. In view of the comments following the proof of Proposition 2.5,
we see that the results of this section hold if the $C^\infty$ assumption we made
about M is replaced by the requirement that M is only of class $C^{d+2}$.

4 Return to the averaging operator 

We consider here a more general averaging operator. Given a hypersurface
M in $\mathbb{R}^d$ and a surface-carried measure $d\mu = \psi\, d\sigma$ with smooth
density of compact support, we set

(24)    $A(f)(x) = \int_M f(x-y)\, d\mu(y).$ $^6$

We shall prove that under the proper assumptions on M, the operator A
regularizes f as a mapping from $L^2(\mathbb{R}^d)$ to $L^2_k(\mathbb{R}^d)$, and in addition that
it “improves” f in the sense that it takes $L^p(\mathbb{R}^d)$ to $L^q(\mathbb{R}^d)$, for some
q > p, if $1 < p < \infty$.

Theorem 4.1 Suppose the Gauss curvature is non-vanishing at each
point $x \in M$ in the support of $d\mu$. Then

(a) The map A given by (24) takes $L^2(\mathbb{R}^d)$ to $L^2_k(\mathbb{R}^d)$, with $k = \frac{d-1}{2}$.

(b) The map extends to a bounded linear transformation from $L^p(\mathbb{R}^d)$
to $L^q(\mathbb{R}^d)$ with $p = \frac{d+1}{d}$ and $q = d + 1$.

Corollary 4.2 The Riesz diagram (see Section 2 in Chapter 2) of the
map A is the closed triangle in the (1/p, 1/q) plane whose vertices are
(0, 0), (1, 1) and $\left(\frac{d}{d+1}, \frac{1}{d+1}\right)$.

In fact, the $L^p$, $L^q$ boundedness asserted in this corollary is optimal, as
is seen in Exercise 6.

Corollary 4.3 If we only assume that M has at least m non-vanishing
principal curvatures, then the same conclusions hold with k = m/2, and
$p = \frac{m+2}{m+1}$, $q = m + 2$.

The proof of part (a) in the theorem is the same as that for the sphere
once we invoke the decay (21), which implies that $(1 + |\xi|^2)^{k/2}\, \widehat{d\mu}(\xi)$ is
bounded. Hence

    $\|A(f)\|_{L^2_k} = \|(1 + |\xi|^2)^{k/2}\, \widehat{A(f)}(\xi)\|_{L^2}$
    $= \|(1 + |\xi|^2)^{k/2}\, \hat{f}(\xi)\, \widehat{d\mu}(\xi)\|_{L^2}$
    $\le c\|\hat{f}\|_{L^2} = c\|f\|_{L^2}.$


6 Here we have omitted a normalizing factor in the definition of A, since the density $\psi$
is not necessarily positive.
Figure 1. Riesz diagram of the map A in Corollary 4.2


The proof of part (b) combines two aspects of the operator A via
interpolation, somewhat akin to the proof of the Hausdorff-Young theorem
in Section 2 of Chapter 2. First, there is an $L^1 \to L^\infty$ estimate. The
inequality involved is merely one of size, involving only the absolute value
of our functions, but in order to get to it we have to “improve” the
operator A by “integrating” it (of order 1). This estimate does not depend
on the curvature of M.

Second, there is an $L^2 \to L^2$ estimate. It comes, like part (a) of the
theorem, via Plancherel’s theorem together with Theorem 3.1, and it
allows us to “worsen” the operator A by essentially “differentiating” of
degree $\frac{d-1}{2}$. The operator intermediate between the improved and the
worsened operators is A itself, and the resulting intermediate estimate is
then conclusion (b).

The scheme of the proof we have outlined in fact occurs in a number
of situations. To carry it out we need a version of the Riesz interpolation
theorem in which the operator in question is allowed to vary. The
proper framework for this is an analytic family of operators defined
as follows.$^7$

For each s in the strip $S = \{a \le \mathrm{Re}(s) \le b\}$ we assume we are given
a linear mapping $T_s$ taking simple functions on $\mathbb{R}^d$ to functions on $\mathbb{R}^d$
that are locally integrable. We also suppose that for any pair of simple

7 Here we state the results for the space $\mathbb{R}^d$ with Lebesgue measure. The same ideas
can be carried over to the setting of more general measure spaces as in Theorem 2.1 in
Chapter 2.
functions f and g, the function

    $\Phi_0(s) = \int_{\mathbb{R}^d} T_s(f)\, g\, dx$

is continuous and bounded in S and analytic in the interior of S. We
further assume the two boundary estimates

    $\sup_{t \in \mathbb{R}} \|T_{a+it}(f)\|_{L^{q_0}} \le M_0 \|f\|_{L^{p_0}},$

and

    $\sup_{t \in \mathbb{R}} \|T_{b+it}(f)\|_{L^{q_1}} \le M_1 \|f\|_{L^{p_1}}.$

Proposition 4.4 With the above assumptions,

    $\|T_c(f)\|_{L^q} \le M\|f\|_{L^p},$

for any c with $a \le c \le b$, where $c = (1-\theta)a + \theta b$ and $0 \le \theta \le 1$; and

    $\frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}$ and $\frac{1}{q} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}.$

Once we have formulated this result, we in fact observe that we can prove
it by essentially the same argument as in Section 2 in Chapter 2.

We write $s = a(1-z) + bz$, so $z = \frac{s-a}{b-a}$, and the strip S is thereby
transformed into the strip $0 \le \mathrm{Re}(z) \le 1$. For f and g given simple
functions, we write $f_s = |f|^{\gamma(s)} \frac{f}{|f|}$ and $g_s = |g|^{\delta(s)} \frac{g}{|g|}$, where we define
$\gamma(s) = p\left(\frac{1-z}{p_0} + \frac{z}{p_1}\right)$ and $\delta(s) = q'\left(\frac{1-z}{q_0'} + \frac{z}{q_1'}\right)$, with z as above and
primes denoting dual exponents. We then check that

    $\Phi(s) = \int_{\mathbb{R}^d} T_s(f_s)\, g_s\, dx$

is continuous and bounded in the strip S and analytic in the interior.
We then apply the three-lines lemma to $\Phi(s)$ and obtain the desired
conclusion as in the proof of Theorem 2.1 in Chapter 2.

Returning to the averaging operator A, we shall assume (as we may)
that the support of $d\mu$ has been chosen to lie in a ball for which M is
given in coordinates by (18).

Now the operators $T_s$ we will consider are convolution operators

    $T_s(f) = f * K_s$
defined initially for Re(s) > 0, with

(25)    $K_s = \gamma_s\, |x_d - \varphi(x')|_+^{s-1}\, \psi_0(x).$

The following explains the several terms appearing in the definition of $K_s$.

• The factor $\gamma_s$ equals $s(s+1)\cdots(s+N)e^{s^2}$.
The purpose of the product $s(s+1)\cdots(s+N)$ will be clear momentarily,
and the factor $e^{s^2}$ is there to mitigate the growth of that
polynomial as $\mathrm{Im}(s) \to \infty$. Here N is fixed with $N \ge \frac{d-1}{2}$.

• The function $|u|_+^{s-1}$ equals $u^{s-1}$ when u > 0 and equals 0 when
$u \le 0$.

• $\psi_0(x) = \psi(x)\,(1 + |\nabla_{x'}\varphi(x')|^2)^{1/2}$, with $\psi$ the density of $d\mu = \psi\, d\sigma$.

We note first that when Re(s) > 0, the function $K_s$ is integrable over $\mathbb{R}^d$.
Our main claim is then the following.

Proposition 4.5 The Fourier transform $\hat{K}_s(\xi)$ is analytically continuable
into the half-plane $-\frac{d-1}{2} \le \mathrm{Re}(s)$ and satisfies

(26)    $\sup_{\xi \in \mathbb{R}^d} |\hat{K}_s(\xi)| \le M$ in the strip $-\frac{d-1}{2} \le \mathrm{Re}(s) \le 1$.

This is based on the following one-dimensional Fourier transform
calculation. We suppose that F is a $C^\infty$ function on $\mathbb{R}$ with compact support,
and let

(27)    $I_s(\rho) = s(s+1)\cdots(s+N) \int_0^\infty u^{s-1} F(u)\, e^{-2\pi i u\rho}\, du, \quad \rho \in \mathbb{R}.$

Lemma 4.6 $I_s(\rho)$, initially given above for Re(s) > 0, has an analytic
continuation into the half-space Re(s) > -N - 1. Also

(a) $|I_s(\rho)| \le c_s (1 + |\rho|)^{-\mathrm{Re}(s)}$, when $-N - 1 < \mathrm{Re}(s) \le 1$.

(b) $I_0(\rho) = N!\, F(0)$.

Here $c_s$ is at most of polynomial growth in Im(s) and it depends only on
the $C^{N+1}$ norm of F and the support of F.

The reader should note that when $\rho = 0$, we are dealing with the
analytic continuation of a homogeneous distribution, much in the
same way as in Section 2.2 in Chapter 3.
Proof. Write $s(s+1)\cdots(s+N)\,u^{s-1} = \left(\frac{d}{du}\right)^{N+1} u^{s+N}$. Then an
(N + 1)-fold integration by parts yields

    $I_s(\rho) = (-1)^{N+1} \int_0^\infty u^{s+N} \left(\frac{d}{du}\right)^{N+1} \left(F(u)e^{-2\pi i u\rho}\right) du,$

from which the analytic continuation of $I_s$ to the half-space Re(s) >
-N - 1 is evident. It also proves the estimate (a) when $\rho$ is bounded,
for example when $|\rho| < 1$.

The proof of the size estimate (a) when $|\rho| \ge 1$ is similar but requires
a little more care. We break the range of integration in (27) into two
parts, essentially according to $u|\rho| \le 1$ or $u|\rho| > 1$. We suppose $\eta$ is a $C^\infty$
cut-off function on $\mathbb{R}$ with $\eta(u) = 1$ if $|u| \le 1/2$, and $\eta(u) = 0$ if $|u| \ge 1$,
and insert $\eta(u\rho)$ or $1 - \eta(u\rho)$ in the integral (27).

When we insert $\eta(u\rho)$ we write the resulting integral as

    $(-1)^{N+1} \int_0^\infty u^{s+N} \left(\frac{d}{du}\right)^{N+1} \left(\eta(u\rho)\, e^{-2\pi i u\rho} F(u)\right) du,$

and so it is dominated by a constant multiple of

    $(1 + |\rho|)^{N+1} \int_{0 \le u \le 1/|\rho|} u^{\sigma+N}\, du$, with $\sigma = \mathrm{Re}(s)$.

Since $\sigma + N > -1$ this quantity is itself dominated by the product
$(1 + |\rho|)^{N+1} |\rho|^{-\sigma-N-1}$, which is $\lesssim (1 + |\rho|)^{-\sigma}$, since we have assumed $|\rho| \ge 1$.
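When we insert $1 - \eta(u\rho)$ we write the resulting integral as

    $s(s+1)\cdots(s+N)\, \frac{1}{(-2\pi i\rho)^k} \int_0^\infty u^{s-1} F(u)(1 - \eta(u\rho)) \left(\frac{d}{du}\right)^k \left(e^{-2\pi i u\rho}\right) du,$

where k is chosen so that Re(s) < k. Then, except for a factor that does
not depend on $\rho$ (and is a polynomial in s), the integral equals

    $\rho^{-k} \int_0^\infty e^{-2\pi i u\rho} \left(\frac{d}{du}\right)^k \left[u^{s-1} F(u)(1 - \eta(\rho u))\right] du.$

Since F has support in some interval $|u| \le A$, it is easily verified that
the above is dominated by a multiple of $|\rho|^{-k} \int_{1/(2|\rho|)}^{A} u^{\sigma-k-1}\, du$, which is
$O(\rho^{-\sigma})$, because $\sigma = \mathrm{Re}(s) < k$; that yields the bound required in (a).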
Finally, the integration by parts we have used also shows that

    $I_s(\rho) = -(s+1)\cdots(s+N) \int_0^\infty u^s\, \frac{d}{du}\left(F(u)e^{-2\pi i u\rho}\right) du,$

so setting s = 0 gives conclusion (b), since F(0) is equal to the integral
$-\int_0^\infty \frac{d}{du}\left(F(u)e^{-2\pi i u\rho}\right) du$. The lemma is therefore proved.

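We return to Proposition 4.5. Looking back at (25) we see that when
Re(s) > 0, making the change of variables $u = x_d - \varphi(x')$ yields

    $\hat{K}_s(\xi) = \gamma_s \int_{\mathbb{R}^d} |x_d - \varphi(x')|_+^{s-1}\, \psi_0(x)\, e^{-2\pi i(x'\cdot\xi' + x_d\xi_d)}\, dx$

(28)    $= \gamma_s \int_0^\infty u^{s-1} e^{-2\pi i u\xi_d} \left( \int_{\mathbb{R}^{d-1}} e^{-2\pi i(x'\cdot\xi' + \varphi(x')\xi_d)}\, \psi_0(x', u + \varphi(x'))\, dx' \right) du$

    $= e^{s^2} I_s(\xi_d),$

with

    $F(u) = \int_{\mathbb{R}^{d-1}} e^{-2\pi i(x'\cdot\xi' + \varphi(x')\xi_d)}\, \psi_0(x', u + \varphi(x'))\, dx'$

in the formula (27) for $I_s$.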

However, by Theorem 3.1 (essentially the estimates we have for the
integrals in (22)) it follows that $|F(u)| \le c(1 + |\xi|)^{-\frac{d-1}{2}}$, with the same
order of decay in $|\xi|$ for any derivative of F with respect to u. Therefore
by conclusion (a) of the lemma we get that

    $|\hat{K}_s(\xi)| \le c_s\, |e^{s^2}|\, (1 + |\xi_d|)^{-\mathrm{Re}(s)} (1 + |\xi|)^{-\frac{d-1}{2}},$

which yields (26). Note that in the strip $-\frac{d-1}{2} \le \mathrm{Re}(s) \le 1$, we have
$|e^{s^2}| \le c\, e^{-(\mathrm{Im}(s))^2}$ and $c_s$ is at most of polynomial growth in Im(s).
Proposition 4.5 is therefore proved.


We now return to the operators $T_s$ and apply our analysis of the
kernels $K_s$.

Suppose f and g are a pair of simple functions on $\mathbb{R}^d$. The fact that
these are in $L^2$ allows us to use the Fourier transform and Plancherel’s
theorem. So if we set $\Phi_0(s) = \int_{\mathbb{R}^d} T_s(f)\, g\, dx$ for Re(s) > 0, then

    $\Phi_0(s) = \int_{\mathbb{R}^d} (f * K_s)\, g\, dx = \int_{\mathbb{R}^d} (f * K_s)^\wedge(\xi)\, \hat{g}(-\xi)\, d\xi = \int_{\mathbb{R}^d} \hat{K}_s(\xi)\, \hat{f}(\xi)\, \hat{g}(-\xi)\, d\xi.$

So the proposition and Schwarz’s inequality show that $\Phi_0(s)$ is continuous
and bounded on the strip $-\frac{d-1}{2} \le \mathrm{Re}(s) \le 1$ and analytic in the
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interior. It is also apparent by the proposition that 

$$\sup_t\|T_{-\frac{d-1}{2}+it}(f)\|_{L^2} \le M\|f\|_{L^2}.$$

Next, clearly $\sup_x|K_s(x)| \le M$, for $\mathrm{Re}(s) = 1$. Thus

$$\sup_t\|T_{1+it}(f)\|_{L^\infty} \le M\|f\|_{L^1}.$$

However, by (28) and conclusion (b) of the lemma, $\hat{K}_0(\xi) = N!\,\widehat{d\mu}(\xi)$, and
thus

$$T_0(f) = N!\,A(f).$$

We can therefore apply the interpolation theorem, Proposition 4.4. Here
we have $a = -\frac{d-1}{2}$, $b = 1$, and $c = 0$. Also $p_0 = q_0 = 2$, $p_1 = 1$, $q_1 = \infty$.
However $0 = (1-\theta)a + \theta b$ so $\theta = \frac{d-1}{d+1}$. Since $1/p = \frac{1-\theta}{2} + \theta$, we
get $1/p = \frac{d}{d+1}$; similarly $1/q = \frac{1}{d+1}$, giving us the desired result for the
operator $A$.
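
As a check on the exponent arithmetic in this application of Proposition 4.4, the computation can be written out in full. The sketch below assumes the endpoint data used in the argument above: the left edge of the strip is $a = -\frac{d-1}{2}$ with an $L^2 \to L^2$ bound, and the right edge is $b = 1$ with an $L^1 \to L^\infty$ bound.

```latex
\begin{align*}
0 = (1-\theta)a + \theta b
  &= -(1-\theta)\frac{d-1}{2} + \theta
  \;\Longrightarrow\; \theta = \frac{d-1}{d+1}, \\
\frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}
  &= \frac{1-\theta}{2} + \theta = \frac{1+\theta}{2} = \frac{d}{d+1}, \\
\frac{1}{q} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}
  &= \frac{1-\theta}{2} = \frac{1}{d+1}.
\end{align*}
```

Under these assumptions the resulting pair is $p = \frac{d+1}{d}$, $q = d+1$; for instance, when $d = 2$ this reads $L^{3/2} \to L^3$.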


5 Restriction theorems 

We come to a second significant application of oscillatory integrals. Here 
we focus on the possibility of restricting the Fourier transform of a func¬ 
tion to a lower dimensional surface. The background for this is as follows. 

5.1 Radial functions 

To start with, the Fourier transform $\hat{f}$ of an $L^1$ function is continuous
(see Section 4* in Chapter 2, Book III) while by the Hausdorff-Young
theorem, $\hat{f}$ belongs to $L^q$ if $f \in L^p$ for $1 \le p \le 2$, and $1/q + 1/p = 1$.
Now $L^q$ functions are in general determined only almost everywhere.
Thus (without further examination) this suggests that the Fourier transform
of an $L^p$ function, $1 < p \le 2$, cannot in general be meaningfully
defined on a lower dimensional subset, and this is indeed the case when
$p = 2$.

The first hint that things might in fact be quite different is the observation
that for certain $p$, $1 < p < 2$, whenever $f$ is radial and in $L^p$ and
$d \ge 2$, then its Fourier transform is continuous away from the origin.

Proposition 5.1 Suppose $f \in L^p(\mathbb{R}^d)$ is a radial function. Then $\hat{f}$ is
continuous for $\xi \ne 0$ whenever $1 \le p < 2d/(d+1)$.

Note the sequence of exponents $\frac{2d}{d+1}$: $1, \frac{4}{3}, \frac{3}{2}, \frac{8}{5}, \dots$ that tends to 2 as
$d \to \infty$.


Proof. Suppose $f(x) = f_0(|x|)$. Then $\hat{f}(\xi) = F(|\xi|)$ with $F$ defined
by (4), namely,

$$F(\rho) = 2\pi\rho^{-d/2+1}\int_0^\infty J_{d/2-1}(2\pi\rho r)f_0(r)r^{d/2}\,dr. \tag{29}$$

We can make the simplifying assumption that $f$ vanishes in the unit
ball (thus the integral above is taken for $r \ge 1$) because an $L^p$ function
supported in a ball is automatically in $L^1$ and its Fourier transform is
then continuous.

We also restrict $\rho = |\xi|$ to a bounded interval excluding the origin, and
note that then the integral in (29) converges absolutely and uniformly
in $\rho$. In fact the integral is dominated by a constant multiple of

$$\int_1^\infty|f_0(r)|r^{d/2-1/2}\,dr, \tag{30}$$

since $|J_{d/2-1}(u)| \le Au^{-1/2}$ if $u > 0$, as we have already seen. Now let $q$
be the exponent dual to $p$, ($1/p + 1/q = 1$), and write

$$r^{d/2-1/2} = r^{\frac{d-1}{p}}\,r^{\frac{d-1}{q}}\,r^{-\frac{d-1}{2}}.$$

Then by Hölder's inequality the integral (30) is majorized by the product
of an $L^p$ and an $L^q$ norm. The $L^p$ factor is

$$\left(\int_1^\infty|f_0(r)|^p r^{d-1}\,dr\right)^{1/p} = c\|f\|_{L^p(\mathbb{R}^d)},$$

while the second factor is

$$\left(\int_1^\infty r^{d-1-q\left(\frac{d-1}{2}\right)}\,dr\right)^{1/q},$$

and this is finite if $d - 1 - q\left(\frac{d-1}{2}\right) < -1$, which means $q > 2d/(d-1)$,
and thus $p < 2d/(d+1)$. The asserted convergence of (30) therefore
proves the continuity in $\rho$ of $F$ in (29) and establishes the proposition.
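
The exponent bookkeeping behind this application of Hölder's inequality can be spelled out as follows. This is only a sketch of the arithmetic, assuming $d \ge 2$ and $1 < p$ (so that the dual exponent $q$ is finite), with $f_0$ the radial profile of $f$ as in the proof.

```latex
\begin{align*}
r^{\frac{d-1}{2}} &= r^{\frac{d-1}{p}} \cdot r^{(d-1)\left(\frac12 - \frac1p\right)}, \\
\int_1^\infty |f_0(r)|\, r^{\frac{d-1}{2}}\, dr
 &\le \left(\int_1^\infty |f_0(r)|^p\, r^{d-1}\, dr\right)^{1/p}
      \left(\int_1^\infty r^{(d-1)\left(\frac12-\frac1p\right)q}\, dr\right)^{1/q}, \\
(d-1)\left(\tfrac12 - \tfrac1p\right)q
 &= (d-1)\left(\tfrac1q - \tfrac12\right)q = (d-1) - \tfrac{q(d-1)}{2}, \\
(d-1) - \tfrac{q(d-1)}{2} < -1
 &\iff q > \frac{2d}{d-1}
 \iff \frac1p = 1 - \frac1q > \frac{d+1}{2d}
 \iff p < \frac{2d}{d+1}.
\end{align*}
```

For example, in dimension $d = 3$ the condition reads $q > 3$, that is, $p < 3/2$.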

An examination of the proof shows the range $1 \le p < 2d/(d+1)$ cannot
be extended.

We now turn to the question of what happens when $f$ is not assumed
to be radial.

5.2 The problem

Let us fix a (local) hypersurface $M$ in $\mathbb{R}^d$. One can then phrase the
restriction problem for $M$ as follows. Suppose $d\mu$ is a given surface-carried
measure, $d\mu = \psi\,d\sigma$, with smooth non-negative density $\psi$ of compact support.
For a given $1 \le p < 2$, does there exist a $q$ (not necessarily the dual
exponent to $p$) so that the a priori inequality below

$$\left(\int_M|\hat{f}(\xi)|^q\,d\mu(\xi)\right)^{1/q} \le c\|f\|_{L^p(\mathbb{R}^d)} \tag{31}$$

holds?

By this we mean the inequality (31) is to be valid for an appropriate
dense class of functions $f$ in $L^p$, with the bound $c$ independent of $f$. If the
answer to the question is affirmative we say that the $(L^p, L^q)$ restriction
holds for $M$.

Here is what we can assert about this problem. 

1. Non-trivial results of the kind (31) are possible only if $M$ has some
degree of curvature.

2. Suppose $M$ has non-zero Gauss curvature at each point (in particular
when $M$ is the sphere). Then one is led to guess that
the correct range for (31) to be valid is $1 \le p < 2d/(d+1)$ and
$q \le \left(\frac{d-1}{d+1}\right)p'$ with $1/p' + 1/p = 1$. Note the end-points of this relation,
$q = \infty$ when $p = 1$, and $q \to 2d/(d+1)$ when $p \to 2d/(d+1)$.
When $d = 2$ this guess is indeed correct; the proof is outlined in
Problem 4.

3. For $d \ge 3$ it is still not known whether the expected result holds,
but an interesting part, corresponding to $q = 2$ (and hence for $q \le 2$)
is settled. This is what we now address.
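
The end-points of the conjectured relation $q \le \left(\frac{d-1}{d+1}\right)p'$, and the value of $p$ that pairs with $q = 2$, can be checked directly. The following is a short worked computation, assuming only the relation between $q$ and $p'$ stated in item 2.

```latex
\begin{align*}
p = 1 &: \quad p' = \infty, \qquad q \le \frac{d-1}{d+1}\,p' = \infty, \\
p \to \frac{2d}{d+1} &: \quad \frac{1}{p'} = 1 - \frac1p \to \frac{d-1}{2d},
   \qquad \frac{d-1}{d+1}\,p' \to \frac{d-1}{d+1}\cdot\frac{2d}{d-1} = \frac{2d}{d+1}, \\
q = 2 = \frac{d-1}{d+1}\,p' &: \quad p' = \frac{2(d+1)}{d-1}, \qquad
   \frac1p = 1 - \frac{d-1}{2(d+1)} = \frac{d+3}{2d+2}.
\end{align*}
```

So on the critical line the choice $q = 2$ corresponds to $p = \frac{2d+2}{d+3}$.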

5.3 The theorem 

Here we prove the following result. 

Theorem 5.2 Suppose $M$ has non-zero Gauss curvature at each point
of the support of $d\mu$. Then the restriction inequality (31) holds for $q = 2$
and $p = \frac{2d+2}{d+3}$.

Note here that we have another sequence of exponents $\frac{2d+2}{d+3}$: $1, \frac{6}{5}, \frac{4}{3}, \frac{10}{7}, \dots$,
tending to 2.

The proof starts with several quick observations. Let $\mathcal{R}$ denote the
restriction operator

$$\mathcal{R}(f) = \hat{f}(\xi)\Big|_M = \int_{\mathbb{R}^d}e^{-2\pi i x\cdot\xi}f(x)\,dx\,\Big|_M,$$

which is initially defined to map continuous functions $f$ of compact support
on $\mathbb{R}^d$ to continuous functions on $M$. Consider also the "dual" $\mathcal{R}^*$
mapping continuous functions $F$ on $M$ to continuous functions on $\mathbb{R}^d$,
defined by

$$\mathcal{R}^*(F)(x) = \int_M e^{2\pi i x\cdot\xi}F(\xi)\,d\mu(\xi).$$

We note that an interchange of integration proves the duality identity

$$(\mathcal{R}(f), F)_M = (f, \mathcal{R}^*(F))_{\mathbb{R}^d}, \tag{32}$$

where $(f,g)_{\mathbb{R}^d} = \int_{\mathbb{R}^d}f(x)\overline{g(x)}\,dx$ and $(F,G)_M = \int_M F(\xi)\overline{G(\xi)}\,d\mu(\xi)$.

Now we consider the composition $\mathcal{R}^*\mathcal{R}$. We have

$$\mathcal{R}^*\mathcal{R}(f)(x) = \int_M e^{2\pi i x\cdot\xi}\left(\int_{\mathbb{R}^d}e^{-2\pi i y\cdot\xi}f(y)\,dy\right)d\mu(\xi).$$

Hence

$$\mathcal{R}^*\mathcal{R}(f) = f * k, \quad\text{with } k(x) = \widehat{d\mu}(-x). \tag{33}$$

There is then the following relation between bounds for $\mathcal{R}$, $\mathcal{R}^*$ and $\mathcal{R}^*\mathcal{R}$.

Proposition 5.3 For a given $p$ with $p \ge 1$, the three norm estimates
below are equivalent:

(i) $\|\mathcal{R}(f)\|_{L^2(M,d\mu)} \le c\|f\|_{L^p(\mathbb{R}^d)}$.

(ii) $\|\mathcal{R}^*(F)\|_{L^{p'}(\mathbb{R}^d)} \le c\|F\|_{L^2(M,d\mu)}$, where $1/p + 1/p' = 1$.

(iii) $\|\mathcal{R}^*\mathcal{R}(f)\|_{L^{p'}(\mathbb{R}^d)} \le c^2\|f\|_{L^p(\mathbb{R}^d)}$.

The equivalence of (i) and (ii) follows directly from the duality of $L^p$
spaces and the general duality theorem (Theorem 4.1 and Proposition 5.3
in Chapter 1).

If we assume (i) (or (ii)), then this implies (iii) once we apply (ii) with
$F = \mathcal{R}(f)$.

Conversely, we know by (32) that

$$(\mathcal{R}(f), \mathcal{R}(f))_M = (f, \mathcal{R}^*\mathcal{R}(f))_{\mathbb{R}^d}.$$

Hence if (iii) holds, then $(\mathcal{R}(f), \mathcal{R}(f))_M \le c^2\|f\|_{L^p(\mathbb{R}^d)}^2$ by Hölder's
inequality. This gives (i) and the proposition is proved.

From this proposition, we see that to establish the theorem we have
to show that the operator $\mathcal{R}^*\mathcal{R}$ is bounded from $L^p(\mathbb{R}^d)$ to $L^{p'}(\mathbb{R}^d)$,
with $p = \frac{2d+2}{d+3}$. The argument is very much like that for the averaging
operator $A$, except here inverted via the Fourier transform.

In fact, here the analytic family of operators we consider is $\{S_s\}$ given
by

$$S_s(f) = f * k_s,$$

where $k_s$ is defined by $k_s(x) = \hat{K}_s(-x)$, and $K_s$ is given initially by (25),
and with $\hat{K}_s$ extended in the strip $-\frac{d-1}{2} \le \mathrm{Re}(s) \le 1$ by Proposition 4.5.

Recall that $\hat{K}_0(\xi) = N!\,\widehat{d\mu}(\xi)$, so $S_0(f) = N!\,\mathcal{R}^*\mathcal{R}(f)$ by (33). But
when $\mathrm{Re}(s) = 1$, since $K_{1+it} \in L^\infty$ and $\sup_t\|K_{1+it}\|_{L^\infty} \le M$, we have

$$\sup_t\|S_{1+it}(f)\|_{L^2} \le M\|f\|_{L^2}.$$

Also, when $\mathrm{Re}(s) = -\frac{d-1}{2}$, then $k_s \in L^\infty$ by (26) in Proposition 4.5, since
$k_s(x) = \hat{K}_s(-x)$. Thus

$$\sup_t\|S_{-\frac{d-1}{2}+it}(f)\|_{L^\infty} \le M\|f\|_{L^1}.$$

Finally, it is easily verified (again using Proposition 4.5) that $\Phi_0(s) =
\int_{\mathbb{R}^d}S_s(f)g\,dx$ is continuous, and is bounded in the strip $-\frac{d-1}{2} \le \mathrm{Re}(s) \le 1$
and analytic in the interior, whenever $f$ and $g$ are in $L^1(\mathbb{R}^d)$ (and
in particular when $f$ and $g$ are simple). We therefore can apply the
interpolation theorem (Proposition 4.4) to $S_s$. In this case $a = -\frac{d-1}{2}$,
$b = 1$, and $c = 0$, so $0 = (1-\theta)a + \theta b$ implies that $\theta = \frac{d-1}{d+1}$. Also here
$p_0 = 1$, $q_0 = \infty$, and $p_1 = 2$, $q_1 = 2$.

So $1/p = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}$ gives $1/p = 1 - \theta + \theta/2 = 1 - \theta/2$ and as a result
$1/p = \frac{d+3}{2d+2}$. Similarly $1/q = \frac{1-\theta}{q_0} + \frac{\theta}{q_1} = \theta/2$, and $1/q = 1 - 1/p = 1/p'$.
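
For the reader who wishes to verify the interpolation exponents for the family $\{S_s\}$, the arithmetic is as follows. The sketch assumes the endpoint data of this proof: an $L^1 \to L^\infty$ bound on the line $\mathrm{Re}(s) = -\frac{d-1}{2}$ and an $L^2 \to L^2$ bound on the line $\mathrm{Re}(s) = 1$.

```latex
\begin{align*}
\theta &= \frac{d-1}{d+1}, \qquad p_0 = 1,\ q_0 = \infty,\ p_1 = 2,\ q_1 = 2, \\
\frac1p &= \frac{1-\theta}{1} + \frac{\theta}{2} = 1 - \frac{\theta}{2}
        = 1 - \frac{d-1}{2(d+1)} = \frac{d+3}{2d+2}, \\
\frac1q &= \frac{1-\theta}{\infty} + \frac{\theta}{2} = \frac{d-1}{2(d+1)}
        = 1 - \frac1p = \frac{1}{p'}.
\end{align*}
```

Thus the interpolated operator at $s = 0$ maps $L^p$ to $L^{p'}$ with $p = \frac{2d+2}{d+3}$, the exponent in Theorem 5.2.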

Therefore $S_0 = N!\,\mathcal{R}^*\mathcal{R}$ maps $L^p$ to $L^{p'}$ and by the equivalence guaranteed
by Proposition 5.3, the theorem is proved.

Corollary 5.4 Under the assumptions of the theorem, the restriction
inequality (31) holds for $1 \le p \le \frac{2d+2}{d+3}$ and $q \le \left(\frac{d-1}{d+1}\right)p'$.

This follows by combining the critical case $p = \frac{2d+2}{d+3}$, $q \le 2$ (a consequence
of the theorem and Hölder's inequality) with the trivial case
$p = 1$, $q = \infty$ via the Riesz interpolation theorem.

The key to the theorem is of course the decay of the Fourier transform
of the surface-carried measure $d\mu$. This is highlighted by the following
assertion which is clear upon reexamination of the proof of the theorem.

Here we deal with a hypersurface $M$, where we make no explicit assumptions
about its curvature. The measures $d\mu$ considered will be of
the form $\psi\,d\sigma$ as before.

Corollary 5.5 Suppose that for some $\delta > 0$, we have
$|\widehat{d\mu}(\xi)| = O(|\xi|^{-\delta})$ as $|\xi| \to \infty$, for all measures of the above form.
Then the restriction property (31) holds for $p = \frac{2\delta+2}{\delta+2}$ and $q = 2$.

In particular, if we assume $M$ has $m$ non-vanishing principal curvatures,
then using the corollary in Section 3, we get this conclusion for
$p = \frac{2m+4}{m+4}$.
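
The dependence of the exponent $p$ on the decay rate $\delta$ can be traced through the interpolation step. The computation below is a sketch, assuming the proof of Theorem 5.2 is rerun with the strip $-\delta \le \mathrm{Re}(s) \le 1$ in place of $-\frac{d-1}{2} \le \mathrm{Re}(s) \le 1$, and assuming the decay rate $\delta = m/2$ in the case of $m$ non-vanishing principal curvatures.

```latex
\begin{align*}
0 = (1-\theta)(-\delta) + \theta \cdot 1
  &\;\Longrightarrow\; \theta = \frac{\delta}{1+\delta}, \\
\frac1p = 1 - \frac{\theta}{2}
  &= \frac{\delta+2}{2\delta+2}
  \;\Longrightarrow\; p = \frac{2\delta+2}{\delta+2}, \\
\delta = \frac{d-1}{2} &: \quad p = \frac{d+1}{(d+3)/2} = \frac{2d+2}{d+3}, \\
\delta = \frac{m}{2} &: \quad p = \frac{m+2}{(m+4)/2} = \frac{2m+4}{m+4}, \\
\delta = \frac13 &: \quad p = \frac{8/3}{7/3} = \frac87 .
\end{align*}
```

The first special case recovers the exponent of Theorem 5.2, and the last is the value that appears for the cubic curve in Section 6.2.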


6 Application to some dispersion equations 

Dispersion equations have, broadly speaking, the property that as time
varies, their solutions conserve some form of mass or energy (for example,
the $L^2$ norm), yet these solutions disperse, in the sense that their
sup-norms decay as time increases. In what follows we will see how the ideas
we have discussed in this chapter apply to some equations of this kind,
both linear and non-linear.


6.1 The Schrödinger equation

Typical of linear equations of the dispersion kind is the imaginary-time
Schrödinger equation

$$\frac{1}{i}\frac{\partial u}{\partial t} = \Delta u, \tag{34}$$

for $(x,t) \in \mathbb{R}^d\times\mathbb{R} = \mathbb{R}^{d+1}$, with its Cauchy problem of
determining a solution of (34) with initial data $f$, that is,

$$u(x,0) = f(x). \tag{35}$$

Here $\Delta = \sum_{j=1}^d\frac{\partial^2}{\partial x_j^2}$ is the Laplacian on $\mathbb{R}^d$.

If we proceed formally, we are led to define the operators $e^{it\Delta}$ by

$$(e^{it\Delta}f)^\wedge(\xi) = e^{-it4\pi^2|\xi|^2}\hat{f}(\xi), \tag{36}$$

where $^\wedge$ denotes the Fourier transform in the $x$-variable, and one expects
that $u(x,t) = e^{it\Delta}(f)(x)$ is the solution of the problem (34) and (35).
That this is so can be seen in two different contexts, the first of which is
in the setting of the Schwartz space $\mathcal{S}$ of testing functions.

Proposition 6.1 For each $t$:

(i) $e^{it\Delta}$ maps $\mathcal{S}$ to $\mathcal{S}$.

(ii) If we set $u(x,t) = e^{it\Delta}(f)(x)$, with $f \in \mathcal{S}$, then $u$ is a $C^\infty$ function
of $(x,t)$ that satisfies (34) and (35).

(iii) $e^{it\Delta}(f) = f * K_t$, if $t \ne 0$, where $K_t(x) = (4\pi it)^{-d/2}e^{-\frac{|x|^2}{4it}}$.

(iv) $\|e^{it\Delta}(f)\|_{L^\infty} \le (4\pi|t|)^{-d/2}\|f\|_{L^1}$.

Proof. That $e^{it\Delta}$ maps $\mathcal{S}$ to $\mathcal{S}$ is clear because the multiplier $e^{-it4\pi^2|\xi|^2}$
has the property that each derivative in $\xi$ is of at most polynomial
increase. Next, the Fourier inversion formula gives

$$u(x,t) = \int_{\mathbb{R}^d}e^{-it4\pi^2|\xi|^2}e^{2\pi i x\cdot\xi}\hat{f}(\xi)\,d\xi.$$

The rapid decrease of $\hat{f}$ guarantees that the function $u$ is $C^\infty$ in the $x$
and $t$ variables. The fact that then $u$ satisfies (34) is clear since the
action of $\frac{1}{i}\frac{\partial}{\partial t}$ brings down a factor of $-4\pi^2|\xi|^2$, which is the same factor
that results from the application of $\Delta$.

The conclusion (iii) is a consequence of the identity

$$\hat{K}_t(\xi) = e^{-it4\pi^2|\xi|^2}, \quad t \ne 0 \tag{37}$$

when both bounded functions $K_t(x) = (4\pi it)^{-d/2}e^{-\frac{|x|^2}{4it}}$ and $e^{-it4\pi^2|\xi|^2}$
are viewed as tempered distributions, and the usual relation between
convolutions and Fourier transforms as in Chapter 3.

To prove (37) we start with the familiar identity for Gaussians

$$\left(u^{-d/2}e^{-\pi|x|^2/u}\right)^\wedge(\xi) = e^{-\pi u|\xi|^2}, \quad\text{when } u > 0.$$

Here we are dealing with rapidly decreasing functions and the Fourier
transform is taken in, say, the $L^1$ sense. We now write $u = 4\pi s$, and we
extend the above identity by analytic continuation to complex $s = \sigma + it$,
with $\sigma > 0$, since the functions in question are still rapidly decreasing.
Thus

$$\left((4\pi s)^{-d/2}e^{-\frac{|x|^2}{4s}}\right)^\wedge(\xi) = e^{-4\pi^2 s|\xi|^2}.$$

Finally, if $t$ is fixed, $t \ne 0$, then letting $\sigma \to 0$, the functions on the
left-hand side and right-hand side converge pointwise and boundedly (and
hence in the sense of tempered distributions) to $\hat{K}_t(\xi)$ and $e^{-it4\pi^2|\xi|^2}$,
respectively. Therefore (37) is established. Finally

$$\|f * K_t\|_{L^\infty} \le \|K_t\|_{L^\infty}\|f\|_{L^1} = (4\pi|t|)^{-d/2}\|f\|_{L^1},$$

and the proposition is proved.
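
The constant in conclusion (iv) comes from the fact that the kernel $K_t$ has constant modulus. A short verification, assuming only the formula for $K_t$ in conclusion (iii) and real $t \ne 0$, runs as follows.

```latex
\begin{align*}
\left| (4\pi i t)^{-d/2} \right| &= (4\pi |t|)^{-d/2}, \\
e^{-\frac{|x|^2}{4it}} &= e^{\frac{i|x|^2}{4t}}, \qquad
\left| e^{\frac{i|x|^2}{4t}} \right| = 1, \\
|(f * K_t)(x)| &\le \int_{\mathbb{R}^d} |f(y)|\,|K_t(x-y)|\, dy
   = (4\pi|t|)^{-d/2}\, \|f\|_{L^1}.
\end{align*}
```

In particular the sup-norm of the solution decays like $|t|^{-d/2}$ for integrable initial data, which is the dispersion referred to at the start of this section.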

We look again at the operator $e^{it\Delta}$ given by (36), but now in the
context of $L^2$.

Proposition 6.2 For each $t$:

(i) The operator $e^{it\Delta}$ is unitary on $L^2(\mathbb{R}^d)$.

(ii) For every $f$, the mapping $t \mapsto e^{it\Delta}(f)$ is continuous in the $L^2(\mathbb{R}^d)$
norm.

(iii) If $f \in L^2(\mathbb{R}^d)$, then $u(x,t) = e^{it\Delta}(f)(x)$ satisfies (34) in the sense
of distributions.

Proof. Conclusion (i) is immediate from Plancherel's theorem, since
the multiplier $e^{-it4\pi^2|\xi|^2}$ has absolute value one. Now if $f \in L^2(\mathbb{R}^d)$,
then clearly $e^{-it4\pi^2|\xi|^2}\hat{f}(\xi) \to e^{-it_0 4\pi^2|\xi|^2}\hat{f}(\xi)$ in the $L^2(\mathbb{R}^d)$ norm when
$t \to t_0$, so (ii) follows again from Plancherel's theorem.

To prove the third conclusion we use the short-hand $\mathcal{L} = \frac{1}{i}\frac{\partial}{\partial t} - \Delta$,
and $\mathcal{L}' = -\frac{1}{i}\frac{\partial}{\partial t} - \Delta$ for its transpose. Conclusion (iii) asserts that whenever
$\varphi$ is a $C^\infty$ function on $\mathbb{R}^d\times\mathbb{R}$ of compact support, then

$$\iint_{\mathbb{R}^d\times\mathbb{R}}\mathcal{L}'(\varphi)(x,t)\,(e^{it\Delta}f)(x)\,dx\,dt = 0. \tag{38}$$

Now if $f \in \mathcal{S}$, then (38) holds for such $f$, because then $u(x,t) = e^{it\Delta}(f)(x)$
satisfies $\mathcal{L}(u) = 0$ in the usual sense, as we have seen. For general $f \in L^2$,
approximate $f$ in $L^2(\mathbb{R}^d)$ by a sequence $\{f_n\}$ with $f_n \in \mathcal{S}$. Then because
of conclusion (i) we may pass to the limit and obtain (38) for any
$f \in L^2(\mathbb{R}^d)$, finishing the proof of the proposition.

We remark that the decay estimate (iv) in the first proposition can be
extended to read

$$\|e^{it\Delta}f\|_{L^q(\mathbb{R}^d)} \le c_p|t|^{-d(1/p-1/2)}\|f\|_{L^p(\mathbb{R}^d)}, \tag{39}$$

if $1/q + 1/p = 1$, and $1 \le p \le 2$, with $c_p = (4\pi)^{-d(1/p-1/2)}$. This in fact is
a direct consequence of the Riesz interpolation theorem (see Theorem 2.1
in Chapter 2) when we combine the cases corresponding to $p = 1$ and $p = 2$,
in the propositions above. Another way to see (39) is to realize that
the operator $e^{it\Delta}$ is a disguised version of a rescaled Fourier transform,
and thus (39) is a restatement of the Hausdorff-Young theorem. This is
outlined in Exercise 12.
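
The power of $|t|$ and the constant $c_p$ in (39) can be read off from the interpolation between the two endpoint bounds. The following sketch assumes the $L^1 \to L^\infty$ bound $(4\pi|t|)^{-d/2}$ from Proposition 6.1 (iv), the $L^2 \to L^2$ bound $1$ from Proposition 6.2 (i), and the usual form of the Riesz interpolation theorem in which the interpolated norm is at most $M_0^{1-\theta}M_1^{\theta}$.

```latex
\begin{align*}
\frac1p &= \frac{1-\theta}{1} + \frac{\theta}{2}
   \;\Longrightarrow\; \theta = 2\left(1 - \frac1p\right), \qquad
   1 - \theta = \frac2p - 1, \\
\frac1q &= \frac{1-\theta}{\infty} + \frac{\theta}{2} = 1 - \frac1p, \\
\|e^{it\Delta}\|_{L^p \to L^q}
  &\le \left( (4\pi|t|)^{-d/2} \right)^{1-\theta} \cdot 1^{\theta}
   = (4\pi|t|)^{-d\left(\frac1p - \frac12\right)}.
\end{align*}
```

At $p = 1$ this reduces to the decay rate $|t|^{-d/2}$, and at $p = 2$ there is no decay, in agreement with unitarity.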

Now the decay estimates (39) raise the question whether one can see
any decrease for large time, when the initial data is merely assumed to
be in $L^2$. Given the unitarity of $e^{it\Delta}$, the best one can hope for is an
overall, or average, decay in both $x$ and $t$. Thus one is led to ask whether
an estimate of the kind

$$\|u(x,t)\|_{L^q(\mathbb{R}^d\times\mathbb{R})} \le c\|f\|_{L^2(\mathbb{R}^d)} \tag{40}$$

is possible (say for $q < \infty$).

By a simple scaling argument we can see that (40) can hold only with
the exponent $q = \frac{2d+4}{d}$. Indeed, if $u(x,t) = e^{it\Delta}(f)(x)$, replace $f$ by $f_\delta$
where $f_\delta(x) = f(\delta x)$, and $u$ by $u_\delta$, with $u_\delta(x,t) = u(\delta x, \delta^2 t)$, and $\delta > 0$.
Then $u_\delta$ is a solution of (34) with corresponding initial data $f_\delta$. That is,
$u_\delta(x,t) = e^{it\Delta}(f_\delta)(x)$. Thus if (40) held, we would have
$\|u_\delta\|_{L^q(\mathbb{R}^{d+1})} \le c\|f_\delta\|_{L^2(\mathbb{R}^d)}$, for all $\delta > 0$, with $c$ independent of $\delta$.
But $\|f_\delta\|_{L^2(\mathbb{R}^d)} = \delta^{-d/2}\|f\|_{L^2(\mathbb{R}^d)}$, while
$\|u_\delta\|_{L^q(\mathbb{R}^{d+1})} = \delta^{-\frac{d+2}{q}}\|u\|_{L^q(\mathbb{R}^{d+1})}$, and so $\delta^{-\frac{d+2}{q}} \le c'\delta^{-d/2}$
for all $\delta > 0$, which is possible only when $\frac{d+2}{q} = \frac{d}{2}$, that is, $q = \frac{2d+4}{d}$.

One should notice that $q = \frac{2d+4}{d}$ is exactly the (dual) exponent arising
in the restriction result in Theorem 5.2 (that is, $1/p + 1/q = 1$, with
$p = \frac{2d+2}{d+3}$) when we are in $\mathbb{R}^{d+1}$ instead of $\mathbb{R}^d$. This is no accident as we
will now see.
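
The scaling computation, and the match with the restriction exponent in dimension $d+1$, can be written out explicitly. This is a sketch assuming $u \not\equiv 0$ and the dilations $f_\delta(x) = f(\delta x)$, $u_\delta(x,t) = u(\delta x, \delta^2 t)$ introduced above.

```latex
\begin{align*}
\|f_\delta\|_{L^2(\mathbb{R}^d)}^2
  &= \int_{\mathbb{R}^d} |f(\delta x)|^2\, dx
   = \delta^{-d}\, \|f\|_{L^2(\mathbb{R}^d)}^2, \\
\|u_\delta\|_{L^q(\mathbb{R}^{d+1})}^q
  &= \int_{\mathbb{R}}\int_{\mathbb{R}^d} |u(\delta x, \delta^2 t)|^q\, dx\, dt
   = \delta^{-(d+2)}\, \|u\|_{L^q(\mathbb{R}^{d+1})}^q, \\
\delta^{-\frac{d+2}{q}}\, \|u\|_{L^q} \le c\, \delta^{-\frac d2}\, \|f\|_{L^2}
  \ \text{ for all } \delta > 0
  &\;\Longrightarrow\; \frac{d+2}{q} = \frac d2
  \;\Longrightarrow\; q = \frac{2d+4}{d}, \\
\frac1p = 1 - \frac{d}{2d+4} = \frac{d+4}{2d+4},
  &\qquad \frac{2(d+1)+2}{(d+1)+3} = \frac{2d+4}{d+4} = p .
\end{align*}
```

The last line confirms that the dual exponent of $q$ is the exponent of Theorem 5.2 with $d$ replaced by $d+1$.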

Theorem 6.3 If $u(x,t) = e^{it\Delta}(f)(x)$ with $f \in L^2(\mathbb{R}^d)$, then (40) holds
when $q = \frac{2d+4}{d}$.

Results of this kind are called Strichartz estimates. We will see that
in fact this theorem is a direct consequence of the results in Section 5.

We consider the Fourier transform now on the space $\mathbb{R}^{d+1} = \mathbb{R}^d\times\mathbb{R} =
\{(x, x_{d+1})\}$, relabeling the variable $t$ as $x_{d+1}$. In the corresponding dual
space (also $\mathbb{R}^{d+1}$) the dual variables are denoted by $(\xi, \xi_{d+1})$, with $\xi$
dual to $x$ and $\xi_{d+1}$ dual to $x_{d+1}$. In this dual space we take $M$ to be the
paraboloid given by

$$M = \{(\xi, \xi_{d+1}) : \xi_{d+1} = -2\pi|\xi|^2\},$$

where $|\xi|^2 = \xi_1^2 + \cdots + \xi_d^2$.

On $M$ we define the non-negative measure $d\mu = \psi\,d\sigma = \psi_0\,d\xi$, where
$d\xi$ is the Lebesgue measure on $\mathbb{R}^d$, and $\psi_0$ is a $C^\infty$ function of compact
support that equals 1 for $(\xi, \xi_{d+1}) \in M$ and $|\xi| \le 1$. (As a result $\psi =
\psi_0(1 + 16\pi^2|\xi|^2)^{-1/2}$.)

Since the paraboloid $M$ has a non-zero Gauss curvature, we can apply
the restriction theorem, in particular its dual statement given in Proposition 5.3,
with $\mathbb{R}^{d+1}$ in place of $\mathbb{R}^d$. This assertion deals with the operator

$$\mathcal{R}^*(F)(x) = \int_M e^{2\pi i(x\cdot\xi + x_{d+1}\xi_{d+1})}F(\xi, \xi_{d+1})\,d\mu$$

and then guarantees that

$$\|\mathcal{R}^*(F)\|_{L^q(\mathbb{R}^{d+1})} \le c\|F\|_{L^2(M,d\mu)}.$$

Now let us take $F(\xi, \xi_{d+1}) = \hat{f}(\xi)$. Then we see that

$$\mathcal{R}^*(F)(x,t) = \int_{\mathbb{R}^d}e^{-it4\pi^2|\xi|^2}e^{2\pi i x\cdot\xi}\hat{f}(\xi)\psi_0(\xi)\,d\xi,$$

which is $e^{it\Delta}$ applied to the function whose Fourier transform is $\hat{f}\psi_0$,
because we have set $x_{d+1} = t$, $d\mu = \psi_0\,d\xi$, and on $M$ we have $\xi_{d+1} =
-2\pi|\xi|^2$. As a result

$$\|e^{it\Delta}(f)\|_{L^q(\mathbb{R}^{d+1})} \le c\|f\|_{L^2(\mathbb{R}^d)}, \tag{41}$$

whenever $\hat{f}$ is supported in the unit ball. This is the essence of the result
and from it the theorem follows easily.

In fact, if we replace $f$ by $f_\delta(x) = f(\delta x)$, and $u$ by $u_\delta(x,t) = u(\delta x, \delta^2 t)$
then, as we have seen above, (41) also holds with the same bound. However
$(f_\delta)^\wedge(\xi) = \delta^{-d}\hat{f}(\xi/\delta)$, and now the support of $(f_\delta)^\wedge$ is the ball
$|\xi| \le \delta$. So allowing $\delta$ to be arbitrarily large shows that (41) is valid
whenever $f$ is in $L^2$ and $\hat{f}$ has compact support. Since such $f$ are dense
in $L^2$, a simple limiting argument establishes (41) for all $f \in L^2(\mathbb{R}^d)$,
proving the theorem.

6.2 Another dispersion equation 

We now digress briefly to touch on another dispersion equation and sketch
certain aspects that are parallel with the Schrödinger equation.
We have in mind the cubic equation on $\mathbb{R}\times\mathbb{R}$

$$\frac{\partial u}{\partial t} = \frac{\partial^3 u}{\partial x^3},$$

with its initial value problem $u(x,0) = f(x)$.

We can write the solution operator $f \mapsto e^{t\left(\frac{d}{dx}\right)^3}(f)$, with

$$\left(e^{t\left(\frac{d}{dx}\right)^3}(f)\right)^\wedge(\xi) = e^{t(2\pi i\xi)^3}\hat{f}(\xi).$$

Again this operator maps $\mathcal{S}$ to $\mathcal{S}$ for each $t$ and is unitary on $L^2(\mathbb{R})$.

Note one difference with the Schrödinger equation: Here we can envisage
solutions $u$ that are real-valued, which is not possible for the
equation (34), where the solutions need to be complex-valued because of
the coefficient $1/i$.

When $t \ne 0$, we can write

$$e^{t\left(\frac{d}{dx}\right)^3}(f) = f * K_t, \quad\text{for } f \in \mathcal{S},$$

where the kernel $K_t$ is given in terms of the Airy integral

$$\mathrm{Ai}(u) = \frac{1}{2\pi}\int_{\mathbb{R}}e^{i\left(\frac{v^3}{3} + uv\right)}\,dv.\,{}^{8}$$

In fact, since $K_t(x) = \int_{\mathbb{R}}e^{t(2\pi i\xi)^3}e^{2\pi i x\xi}\,d\xi$, the change of variables
$-(2\pi)^3\xi^3 t = v^3/3$, $\xi = -v(3t)^{-1/3}(2\pi)^{-1}$ shows that

$$K_t(x) = (3t)^{-1/3}\,\mathrm{Ai}\left(-x/(3t)^{1/3}\right).$$
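
The change of variables leading to the Airy kernel can be followed term by term. The sketch below takes $t > 0$ and treats the oscillatory integrals formally (their convergence is addressed in the footnote to the Airy integral).

```latex
\begin{align*}
t(2\pi i \xi)^3 &= -i(2\pi)^3 \xi^3 t, \qquad
\xi = -\frac{v}{2\pi (3t)^{1/3}}
  \;\Longrightarrow\; -(2\pi)^3\xi^3 t = \frac{v^3}{3}, \quad
  2\pi x \xi = -\frac{x}{(3t)^{1/3}}\, v, \\
K_t(x) &= \int_{\mathbb{R}} e^{t(2\pi i\xi)^3} e^{2\pi i x \xi}\, d\xi
  = \frac{1}{2\pi (3t)^{1/3}} \int_{\mathbb{R}}
     e^{i\left(\frac{v^3}{3} + u v\right)}\, dv
  = (3t)^{-1/3}\, \mathrm{Ai}(u), \qquad u = -\frac{x}{(3t)^{1/3}}.
\end{align*}
```

The factor $(3t)^{-1/3}$ is the Jacobian of the substitution, and it is this factor that produces the $|t|^{-1/3}$ decay once $\mathrm{Ai}$ is known to be bounded.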


Now one knows that

$$|\mathrm{Ai}(u)| \le c, \qquad |\mathrm{Ai}(-u)| \le c|u|^{-1/4}$$

for all $u$. From the first of these inequalities we get the dispersion
estimate

$$\left\|e^{t\left(\frac{d}{dx}\right)^3}(f)\right\|_{L^\infty} \le c|t|^{-1/3}\|f\|_{L^1}.$$

There is also an analog of Theorem 6.3. 



8 The convergence of this integral and the estimates stated below can be found in 
Appendix A of Book II. There these are carried out using complex analysis. The results 
needed can also be obtained by the methods in Section 2 of this chapter, and are outlined 
in Exercise 13. 




354 


Chapter 8. OSCILLATORY INTEGRALS IN FOURIER ANALYSIS 


Theorem 6.4 The solution $u(x,t) = e^{t\left(\frac{d}{dx}\right)^3}(f)$ satisfies

$$\|u\|_{L^q(\mathbb{R}^2)} \le c\|f\|_{L^2(\mathbb{R})}, \quad\text{with } q = 8.$$

The proof of this result is parallel with that of the previous theorem
and reduces to a restriction theorem on $\mathbb{R}^2$ for the cubic curve

$$\Gamma = \{(\xi_1, \xi_2) : \xi_2 = -4\pi^2\xi_1^3\}.$$

According to Corollary 5.5, what is needed is an estimate for $\widehat{d\mu}(\xi)$,
where $d\mu$ is a smooth measure carried on the cubic curve $\Gamma$. The desired
estimate can be rephrased as follows.

Lemma 6.5 Let $I(\xi) = \int_{\mathbb{R}}e^{i(\xi_1 t + \xi_2 t^3)}\psi(t)\,dt$, where $\psi$ is a $C^\infty$ function
of compact support. Then

$$I(\xi) = O(|\xi|^{-1/3}), \quad\text{as } |\xi| \to \infty.$$

Proof. First note that $I(\xi) = O(|\xi_2|^{-1/3})$. In fact

$$I(\xi) = \int_{|t| \le |\xi_2|^{-1/3}} + \int_{|t| > |\xi_2|^{-1/3}}.$$

The first integral is obviously $O(|\xi_2|^{-1/3})$. For the second term we use
the second derivative test (Proposition 2.3 and Corollary 2.4) noting that
the second derivative of the phase exceeds $c|\xi_2||\xi_2|^{-1/3} = c|\xi_2|^{2/3}$, so this
term is also $O(|\xi_2|^{-1/3})$, which proves that $I(\xi) = O(|\xi_2|^{-1/3})$. We apply
this result when $|\xi_2| \ge c'|\xi_1|$, where $c'$ is a suitably small constant, giving
$I(\xi) = O(|\xi|^{-1/3})$ in this case.

In the case when $|\xi_1| > (1/c')|\xi_2|$, we apply the first derivative test
(Proposition 2.1) noting that there the first derivative of the phase exceeds
a multiple of $|\xi_1|$. Thus $I(\xi) = O(|\xi_1|^{-1}) = O(|\xi|^{-1/3})$. A combination
of these two cases yields the lemma.

We can now invoke Corollary 5.5 with $\delta = 1/3$ and obtain

$$\|\mathcal{R}(f)\|_{L^2(\Gamma)} \le c\|f\|_{L^p(\mathbb{R}^2)}$$

and

$$\|\mathcal{R}^*(F)\|_{L^q(\mathbb{R}^2)} \le c\|F\|_{L^2(\Gamma)},$$

for $p = \frac{2\delta+2}{\delta+2} = \frac{8}{7}$, and $1/p + 1/q = 1$, so $q = 8$. The estimate for $\mathcal{R}^*$ then
proves our theorem.

There are also corresponding space-time estimates for solutions of the
wave equation in terms of its initial data. See Problem 5.

6.3 The non-homogeneous Schrödinger equation

We return to the imaginary-time Schrödinger equation and now consider
the non-homogeneous problem

$$\frac{1}{i}\frac{\partial u}{\partial t} - \Delta u = F, \tag{43}$$

with $F$ given. Here we require

$$u(x,0) = 0. \tag{44}$$

It is easy to write down a formal solution to this problem by integrating
the corresponding equation when $\Delta$ is replaced by a scalar. This leads
to the solution operator

$$S(F)(x,t) = i\int_0^t e^{i(t-s)\Delta}F(\cdot,s)\,ds. \tag{45}$$

Here $e^{i(t-s)\Delta}F(\cdot,s)$ indicates that for each $t$ and $s$ the operator $e^{i(t-s)\Delta}$
has been applied to $F(x,s)$ as a function of $x$. The use of formula (45)
can be justified in several different settings. The simplest is the following.

Proposition 6.6 Suppose $F$ is a $C^\infty$ function on $\mathbb{R}^d\times\mathbb{R}$ of compact
support. Then $S(F)$ is a $C^\infty$ function that satisfies (43) and (44).

Proof. Write $S(F)(\cdot,t) = e^{it\Delta}G(\cdot,t)$ with $G(x,t) = i\int_0^t e^{-is\Delta}F(\cdot,s)\,ds$. Now
$F(\cdot,s)$ is in the Schwartz space $\mathcal{S}(\mathbb{R}^d)$ for each $s$ and depends smoothly
on $s$. Thus the same is true for $G(\cdot,s)$ and then for $S(F)(\cdot,s)$, so this
function is $C^\infty$. Now differentiate both sides of the identity

$$e^{-it\Delta}(S(F))(\cdot,t) = i\int_0^t e^{-is\Delta}F(\cdot,s)\,ds,$$

with respect to $t$.

The left-hand side gives $e^{-it\Delta}\left(-i\Delta + \frac{\partial}{\partial t}\right)S(F)(\cdot,t)$. The right-hand
side yields $ie^{-it\Delta}F(\cdot,t)$. After composing with $e^{it\Delta}$, we see that

$$\frac{1}{i}\frac{\partial}{\partial t}S(F) - \Delta S(F) = F,$$

as was to be proved. Note that it is obvious that $S(F)(\cdot,0) = 0$.
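
The formal derivation of the solution operator (45), obtained by replacing $\Delta$ with a scalar, can be made explicit. In the sketch below $a$ is a hypothetical real constant standing in for $\Delta$, and $F$ is taken to be a continuous function of $t$ alone.

```latex
\begin{align*}
\frac1i \frac{du}{dt} - a u = F(t), \quad u(0) = 0
  &\iff \frac{d}{dt}\left( e^{-iat} u(t) \right) = i\, e^{-iat} F(t), \quad u(0) = 0, \\
  &\iff u(t) = i \int_0^t e^{i(t-s)a} F(s)\, ds .
\end{align*}
```

Replacing the scalar $e^{i(t-s)a}$ by the operator $e^{i(t-s)\Delta}$ gives the expression (45), and the proof of Proposition 6.6 repeats the same integrating-factor step with $e^{-it\Delta}$.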

The corresponding result in the $L^2$ setting is detailed in Exercise 14.
We come to the key estimate for the operator $S$. It arises from the
question of proving an estimate of the form

$$\|S(F)\|_{L^q(\mathbb{R}^d\times\mathbb{R})} \le c\|F\|_{L^p(\mathbb{R}^d\times\mathbb{R})}, \tag{46}$$

where $q = \frac{2d+4}{d}$. Here $q$ is the exponent for which $u \in L^q(\mathbb{R}^d\times\mathbb{R})$, whenever
$u(x,t) = e^{it\Delta}(f)(x)$, with $f \in L^2$. Again, a simple scaling argument
(which we leave to the reader) shows that (46) can hold only if $p = \frac{2d+4}{d+4}$,
the dual exponent of $q$.
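
One way to carry out the scaling argument left to the reader is sketched here. It assumes the dilation $F_\delta(x,t) = F(\delta x, \delta^2 t)$, the value $q = \frac{2d+4}{d}$, and the fact that if $u = S(F)$ solves (43) and (44) then $\delta^{-2}u(\delta x, \delta^2 t)$ solves the same problem with $F_\delta$ in place of $F$.

```latex
\begin{align*}
F_\delta(x,t) &= F(\delta x, \delta^2 t)
  \;\Longrightarrow\; S(F_\delta)(x,t) = \delta^{-2}\, S(F)(\delta x, \delta^2 t), \\
\|S(F_\delta)\|_{L^q(\mathbb{R}^d \times \mathbb{R})}
  &= \delta^{-2 - \frac{d+2}{q}}\, \|S(F)\|_{L^q}, \qquad
\|F_\delta\|_{L^p(\mathbb{R}^d \times \mathbb{R})}
   = \delta^{-\frac{d+2}{p}}\, \|F\|_{L^p}, \\
2 + \frac{d+2}{q} &= \frac{d+2}{p}
  \;\Longrightarrow\; \frac1p = \frac1q + \frac{2}{d+2}
   = \frac{d}{2d+4} + \frac{4}{2d+4} = \frac{d+4}{2d+4}.
\end{align*}
```

Since $\frac{d}{2d+4} + \frac{d+4}{2d+4} = 1$, the exponent $p$ forced by scaling is indeed dual to $q$.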

Theorem 6.7 The estimate (46) holds if $q = \frac{2d+4}{d}$ and $p = \frac{2d+4}{d+4}$.

This means that $S$, initially defined on $C^\infty$ functions $F$ of compact support,
satisfies (46) with $c$ independent of $F$, and hence has a unique
extension to a bounded operator from $L^p(\mathbb{R}^d\times\mathbb{R})$ to $L^q(\mathbb{R}^d\times\mathbb{R})$ for
which (46) is valid.

To prove the theorem we first make two simplifications. To begin with,
we replace the operator $S$ by $S_+$ given by

$$S_+(F)(x,t) = i\int_{-\infty}^t e^{i(t-s)\Delta}F(\cdot,s)\,ds,$$

and next, to avoid issues of convergence, we replace $S_+$ by $S_\epsilon$, where

$$S_\epsilon(F)(x,t) = i\int_{-\infty}^t e^{i(t-s)\Delta}e^{-\epsilon(t-s)}F(\cdot,s)\,ds.$$

We will prove that

$$\|S_\epsilon(F)\|_{L^q(\mathbb{R}^d\times\mathbb{R})} \le c\|F\|_{L^p(\mathbb{R}^d\times\mathbb{R})} \tag{47}$$

with $c$ independent of $\epsilon$. Once (47) is established then (46) will follow
easily.

The advantage of $S_+$ (and $S_\epsilon$) over $S$ is that now we are dealing with
convolutions on the space $\mathbb{R}^d\times\mathbb{R}$. For $S_\epsilon$ the kernel $\mathcal{K}(x,t)$ is formally
$\frac{i}{(4\pi it)^{d/2}}e^{-\frac{|x|^2}{4it}}e^{-\epsilon t}$ when $t > 0$, and 0 when $t < 0$.

We prove (47) by the same method used in Theorem 4.1 and in the
restriction theorem. We embed $S_\epsilon$ in an analytic family of operators,
$\{T_z\}$, with the complex variable ranging over the half-plane $-1 \le \mathrm{Re}(z)$.
The operator will be first given when $d/2 - 1 < \mathrm{Re}(z)$ as a convolution,
$T_z(F) = F * \mathcal{K}_z$, with the locally integrable kernel

$$\mathcal{K}_z(x,t) = \gamma(z)\,e^{-\epsilon t}\,t_+^{z}\,(4\pi it)^{-d/2}e^{-\frac{|x|^2}{4it}}. \tag{48}$$

Here $t_+^z = t^z$ when $t > 0$ and 0 otherwise, while $\gamma(z) = \frac{ie^{z^2}}{\Gamma(z+1)}$, and the
factor $\gamma(z)$ is bounded in any strip $a \le \mathrm{Re}(z) \le b$, because $\frac{1}{\Gamma(z+1)} =
O(e^{|z|\log|z|})$ as $|z| \to \infty$, by Stirling's formula. We note that the Fourier
transform of $\mathcal{K}_z$ on $\mathbb{R}^d\times\mathbb{R}$ (as a tempered distribution) is the function

$$\hat{\mathcal{K}}_z(\xi, \xi_{d+1}) = \gamma(z)\int_0^\infty e^{-i4\pi^2 t|\xi|^2}e^{-\epsilon t}e^{-2\pi it\xi_{d+1}}t^z\,dt$$

$$= ie^{z^2}\left(\epsilon + i(4\pi^2|\xi|^2 + 2\pi\xi_{d+1})\right)^{-z-1}.$$

This is because of (37) and the fact that

$$\int_0^\infty e^{-At}t^z\,dt = \Gamma(z+1)A^{-z-1} \quad\text{whenever } \mathrm{Re}(A) > 0,$$

as is seen by verifying the formula first when $A > 0$.

Next, if $\epsilon$ is fixed with $\epsilon > 0$, then $\hat{\mathcal{K}}_z$ is, by the above, a bounded function
of $(\xi, \xi_{d+1}) \in \mathbb{R}^d\times\mathbb{R}$ as long as $-1 \le \mathrm{Re}(z)$. This Fourier multiplier
defines $T_z$ as a bounded operator on $L^2(\mathbb{R}^d\times\mathbb{R})$ whenever $-1 \le \mathrm{Re}(z)$,
and gives a continuation of $T_z$, initially defined for $d/2 - 1 < \mathrm{Re}(z)$. We
also observe that $\hat{\mathcal{K}}_z$ is bounded independently of $\epsilon$ when $\mathrm{Re}(z) = -1$,
and therefore

$$\|T_z(F)\|_{L^2(\mathbb{R}^d\times\mathbb{R})} \le c\|F\|_{L^2(\mathbb{R}^d\times\mathbb{R})} \quad\text{when } \mathrm{Re}(z) = -1, \tag{49}$$

with $c$ independent of $\epsilon$.

Now the kernel $\mathcal{K}_z$ given by (48) is clearly a bounded function on
$\mathbb{R}^d\times\mathbb{R}$ when $\mathrm{Re}(z) = d/2$, with a bound independent of $\epsilon$. Thus

$$\|T_z(F)\|_{L^\infty} \le c\|F\|_{L^1}, \quad\text{when } \mathrm{Re}(z) = d/2, \tag{50}$$

with $c$ again independent of $\epsilon$.

The interpolation theorem (Proposition 4.4) yields $\|T_0(F)\|_{L^q} \le c\|F\|_{L^p}$,
first for simple functions, and then by a passage to the limit for all $F$
that are $C^\infty$ of compact support. Again the bound is independent of $\epsilon$.
We also recognize that

$$T_0 = S_\epsilon \tag{51}$$

when acting on $C^\infty$ functions of compact support.

In fact, by taking the Fourier transform in the $x$-variable we see that

$$S_\epsilon(F)^\wedge(\xi, t) = i\int_{-\infty}^t e^{-i(t-s)4\pi^2|\xi|^2}e^{-\epsilon(t-s)}\hat{F}(\xi, s)\,ds.$$

Then the Fourier transform in the $t$ variable gives

$$S_\epsilon(F)^\wedge(\xi, \xi_{d+1}) = i\left(\epsilon + i(4\pi^2|\xi|^2 + 2\pi\xi_{d+1})\right)^{-1}\hat{F}(\xi, \xi_{d+1}),$$

which establishes (51), and hence proves (47).
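
The exponents delivered by the interpolation between (49) and (50) can be checked as follows. The sketch assumes the $L^2 \to L^2$ bound on the line $\mathrm{Re}(z) = -1$ and the $L^1 \to L^\infty$ bound on the line $\mathrm{Re}(z) = d/2$, with the interpolated operator taken at $z = 0$.

```latex
\begin{align*}
0 = (1-\theta)(-1) + \theta\,\frac d2
  &\;\Longrightarrow\; \theta = \frac{2}{d+2}, \\
\frac1p = \frac{1-\theta}{2} + \frac{\theta}{1} = \frac{1+\theta}{2}
  &= \frac{d+4}{2d+4}, \\
\frac1q = \frac{1-\theta}{2} + \frac{\theta}{\infty} = \frac{1-\theta}{2}
  &= \frac{d}{2d+4}.
\end{align*}
```

These are the values $p = \frac{2d+4}{d+4}$ and $q = \frac{2d+4}{d}$ of Theorem 6.7.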

We now finish the proof by modifying $F$ so that $F(x,s) = 0$ when
$s < 0$. Hence from (47), when we let $\epsilon \to 0$, we get

$$\left(\int_0^\infty\int_{\mathbb{R}^d}|S(F)(x,t)|^q\,dx\,dt\right)^{1/q} \le c\|F\|_{L^p(\mathbb{R}^d\times\mathbb{R})}.$$

Changing $t$ into $-t$ (and $s$ into $-s$) gives us a parallel inequality, but with
the integration in $t$ now taken over $(-\infty, 0)$. Adding these two finally
yields (46) and the theorem is proved.



A final fact about the action of the solving operator $S$ on the space
$L^p(\mathbb{R}^d\times\mathbb{R})$ is as follows.

Proposition 6.8 If $F \in L^p(\mathbb{R}^d\times\mathbb{R})$ then $S(F)$ can be corrected (that is,
redefined on a set of measure zero) so that for each $t$, $S(F)(\cdot,t)$ belongs
to $L^2(\mathbb{R}^d)$ and, moreover, the map $t \mapsto S(F)(\cdot,t)$ is continuous in the
$L^2(\mathbb{R}^d)$ norm.

This is based on the inequality

$$\left\|\int_\alpha^\beta e^{-is\Delta}F(\cdot,s)\,ds\right\|_{L^2(\mathbb{R}^d)} \le c\|F\|_{L^p(\mathbb{R}^d\times\mathbb{R})}, \tag{52}$$

with $c$ independent of the finite numbers $\alpha$ and $\beta$.

In fact, (52) is essentially the dual statement of (40) in Theorem 6.3.
We let $g$ be any element of $L^2(\mathbb{R}^d)$ with $\|g\|_{L^2(\mathbb{R}^d)} \le 1$. Then by the
unitarity of $e^{-is\Delta}$ we have

$$\int_\alpha^\beta\left(\int_{\mathbb{R}^d}e^{-is\Delta}F(x,s)\overline{g(x)}\,dx\right)ds = \int_\alpha^\beta\left(\int_{\mathbb{R}^d}F(x,s)\overline{v(x,s)}\,dx\right)ds,$$

where $v(x,s) = (e^{is\Delta}g)(x)$. So by (40), $\|v\|_{L^q(\mathbb{R}^d\times\mathbb{R})} \le c$ and Hölder's
inequality gives

$$\left|\int_{\mathbb{R}^d}\left(\int_\alpha^\beta e^{-is\Delta}F(\cdot,s)\,ds\right)\overline{g(x)}\,dx\right| \le c\|F\|_{L^p(\mathbb{R}^d\times\mathbb{R})},$$

and since $g$ was arbitrary, this suffices to establish (52).

Next, since $S(F)(x,t) = ie^{it\Delta}\int_0^t e^{-is\Delta}F(\cdot,s)\,ds$, taking $\alpha = 0$ and $\beta =
t$ in (52), we see that for each $t$ the function $S(F)(\cdot,t)$ belongs to $L^2(\mathbb{R}^d)$,
and

$$\sup_t\|S(F)(\cdot,t)\|_{L^2(\mathbb{R}^d)} \le c\|F\|_{L^p(\mathbb{R}^d\times\mathbb{R})}. \tag{53}$$

Finally, approximate $F$ in the $L^p$ norm by a sequence $\{F_n\}$ of $C^\infty$ functions
of compact support. Then for each $n$, $S(F_n)(\cdot,t)$ is clearly continuous
in $t$ in the $L^2(\mathbb{R}^d)$ norm. Since by (53)

$$\sup_t\|S(F)(\cdot,t) - S(F_n)(\cdot,t)\|_{L^2} \le c\|F - F_n\|_{L^p} \to 0,$$

the continuity in $t$ carries over to $S(F)(\cdot,t)$ and the proposition is proved.


6.4 A critical non-linear dispersion equation 

We now consider the non-linear problem

$$\begin{cases}\dfrac{1}{i}\dfrac{\partial u}{\partial t} - \Delta u = \sigma|u|^{\lambda-1}u,\\[4pt] u(x,0) = f(x).\end{cases} \tag{54}$$

Here $\sigma$ is a non-zero real number and the exponent $\lambda$ is greater than 1.
Besides its relative simplicity, the interest of the equation (54) is that
its solution has two noteworthy conservation properties, namely that
the "mass" $\int_{\mathbb{R}^d}|u|^2\,dx$, and the "energy" $\int_{\mathbb{R}^d}\left(\frac{1}{2}|\nabla u|^2 - \frac{\sigma}{\lambda+1}|u|^{\lambda+1}\right)dx$ are
conserved over time. (See Exercise 15.)

We shall deal in particular with the initial-value problem for $f$ in
$L^2(\mathbb{R}^d)$. In this setting there is a "critical" exponent $\lambda$, the one for
which the problem is scale-invariant. More precisely, suppose $u$ is any
solution of (54) with initial data $f$. Then we seek an exponent $a$ so that
$\delta^a u(\delta x, \delta^2 t)$ also solves the equation (54), (with initial data $\delta^a f(\delta x)$), for
all $\delta > 0$. For the linear case $\sigma = 0$ of course any $a$ will do, but in the
present situation this requires $a + 2 = \lambda a$. Now if we also want the $L^2$
norm of the initial data to be invariant under these scalings then we need
$a = d/2$, and as a result $\lambda = 1 + 4/d$.

We should observe a related significant fact about the critical exponent
$\lambda$: we have $q = \lambda p$, where $q$ and $p$ are the dual exponents arising in
our estimates (Theorems 6.3 and 6.7). This is the case because $q = \frac{2d+4}{d}$,
$p = \frac{2d+4}{d+4}$, and $\lambda = \frac{d+4}{d}$.
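
The scaling conditions that single out the critical exponent can be verified line by line. The sketch below assumes the dilation $u_\delta(x,t) = \delta^a u(\delta x, \delta^2 t)$ and the values of $p$ and $q$ from Theorems 6.3 and 6.7.

```latex
\begin{align*}
u_\delta(x,t) &= \delta^{a} u(\delta x, \delta^2 t): \qquad
\frac1i \frac{\partial u_\delta}{\partial t} - \Delta u_\delta
  = \delta^{a+2} \left( \frac1i \frac{\partial u}{\partial t} - \Delta u \right)(\delta x, \delta^2 t), \\
\sigma |u_\delta|^{\lambda-1} u_\delta
  &= \delta^{a\lambda}\, \sigma \left( |u|^{\lambda-1} u \right)(\delta x, \delta^2 t)
  \;\Longrightarrow\; a + 2 = a\lambda, \\
\|\delta^{a} f(\delta\,\cdot)\|_{L^2(\mathbb{R}^d)}
  &= \delta^{a - \frac d2}\, \|f\|_{L^2(\mathbb{R}^d)}
  \;\Longrightarrow\; a = \frac d2, \qquad
  \lambda = 1 + \frac2a = 1 + \frac4d, \\
\lambda p &= \frac{d+4}{d} \cdot \frac{2d+4}{d+4} = \frac{2d+4}{d} = q .
\end{align*}
```

The identity $q = \lambda p$ is what allows the non-linearity $|u|^{\lambda-1}u$ of a function $u \in L^q$ to be placed in $L^p$, the space on which $S$ acts.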





Incidentally, one notices that the exact value of the coefficient $\sigma$ in (54)
is not significant; what matters is its sign, since it can be replaced by $\pm 1$
via the fixed scaling $(x,t) \to (|\sigma|^{1/2}x, |\sigma|t)$.

After these preliminaries we can now state the main result. Given an
$f \in L^2(\mathbb{R}^d)$, we will say that a function $u$ in $L^q(\mathbb{R}^d\times\mathbb{R})$ is a strong
solution of (54) if

(i) $u$ satisfies the differential equation in the sense of distributions.

(ii) For each $t$ the function $u(\cdot,t)$ belongs to $L^2(\mathbb{R}^d)$, the mapping
$t \mapsto u(\cdot,t)$ is continuous in the $L^2(\mathbb{R}^d)$ norm, and $u(\cdot,0) = f$.

We can also envisage solutions $u$ that are given only for time $t$ with
$|t| < a$, for some fixed $0 < a < \infty$. In that case we assume $u$ is in
$L^q(\mathbb{R}^d\times\{|t| < a\})$ and consider $u$ as a distribution on the open set
$\mathbb{R}^d\times\{|t| < a\} \subset \mathbb{R}^d\times\mathbb{R}$, and define a strong solution in the same way as above.

The theorem below guarantees the solution of our problem under two
scenarios. First for all times if the initial data is small enough. Second,
for all initial data $f$, for a finite time interval.

Theorem 6.9 Suppose $\lambda$, $p$ and $q$ are as above.

(i) There is an $\epsilon > 0$ so that whenever $\|f\|_{L^2(\mathbb{R}^d)} < \epsilon$ then there exists
a strong solution of (54).

(ii) Given any $f \in L^2(\mathbb{R}^d)$, there is an $a > 0$, (depending on $f$), so
that (54) has a strong solution for $|t| < a$.

The proof exemplifies the use of fixed-point arguments in non-linear
problems.

Suppose $u_0 = e^{it\Delta}(f)$. As will be seen, the problem reduces to finding
$u$ so that

$$u = \sigma S(|u|^{\lambda-1}u) + u_0. \tag{55}$$

The existence of $u$ is obtained by a classical iteration argument, the
existence of a fixed point of a suitable contraction mapping $\mathcal{M}$.

We consider first the alternative (i) of the theorem and here the mapping
$\mathcal{M}$ will be defined on the underlying space

$$\mathcal{B} = \{u \in L^q(\mathbb{R}^d\times\mathbb{R}), \text{ with } \|u\|_{L^q} \le \delta\},$$

with $\delta$ fixed below.

The mapping M. will be given by 

M(u) = crS ， (|w| A '" 1 w) + ^o- 
For an appropriate choice of $\delta$, and then a choice of $\epsilon$ with $\|f\|_{L^2} < \epsilon$,
we will see that

(a) $\mathcal{M}$ maps $\mathcal{B}$ to itself.

(b) $\|\mathcal{M}(u) - \mathcal{M}(v)\|_{L^q} \le \frac{1}{2}\|u - v\|_{L^q}$ for $u, v \in \mathcal{B}$.

In fact, $\|\mathcal{M}(u)\|_{L^q} \le |\sigma|\,\|S(|u|^{\lambda-1}u)\|_{L^q} + \|u_0\|_{L^q}$. To estimate the first
term we use Theorem 6.7, and this gives

$\|S(|u|^{\lambda-1}u)\|_{L^q} \le c\,\||u|^{\lambda}\|_{L^p} = c\,\|u\|_{L^q}^{\lambda},$

since $q = \lambda p$. So if $\|u\|_{L^q} \le \delta$, then $|\sigma|\,\|S(|u|^{\lambda-1}u)\|_{L^q} \le \delta/2$, as long as
$|\sigma| c\,\delta^{\lambda} \le \delta/2$, which is the case if $\delta$ is small enough, because $\lambda > 1$.

However by Theorem 6.3, $\|u_0\|_{L^q} \le c\epsilon$, since $\|f\|_{L^2} < \epsilon$. Thus $\|u_0\|_{L^q} \le \delta/2$,
if $c\epsilon \le \delta/2$, and with this choice of $\epsilon$ in terms of $\delta$, property (a) is
proved.

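Next

$\|\mathcal{M}(u) - \mathcal{M}(v)\|_{L^q} = |\sigma|\,\|S(|u|^{\lambda-1}u - |v|^{\lambda-1}v)\|_{L^q} \le c|\sigma|\,\||u|^{\lambda-1}u - |v|^{\lambda-1}v\|_{L^p}.$

However, as is easily verified,

$\big||u|^{\lambda-1}u - |v|^{\lambda-1}v\big| \le c_\lambda |u - v|\,(|u| + |v|)^{\lambda-1}$

for any pair of complex numbers $u$ and $v$. Thus

$\||u|^{\lambda-1}u - |v|^{\lambda-1}v\|_{L^p} \le c_\lambda\,\|(u - v)(|u| + |v|)^{\lambda-1}\|_{L^p}.$

Disregarding the constant $c_\lambda$, the $p$-th power of the term on the right is
$\int |u - v|^p (|u| + |v|)^{(\lambda-1)p}$. We estimate this by using Hölder's inequality
with exponents $\lambda$ and $\lambda' = \lambda/(\lambda - 1)$. Since $\lambda p = q$ and $\lambda'(\lambda - 1)p = q$
we see that this integral is majorized by

$\left(\int |u - v|^q\right)^{1/\lambda}\left(\int (|u| + |v|)^q\right)^{1/\lambda'} \le \|u - v\|_{L^q}^{p}\,\big(\|u\|_{L^q} + \|v\|_{L^q}\big)^{(\lambda-1)p}.$

Taking the $p$-th root gives

$\|\mathcal{M}(u) - \mathcal{M}(v)\|_{L^q} \le c'_\lambda\,\|u - v\|_{L^q}\,\big(\|u\|_{L^q} + \|v\|_{L^q}\big)^{\lambda-1},$

and we only need to choose $\delta$ so that $c'_\lambda (2\delta)^{\lambda-1} \le 1/2$ to obtain (b).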
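Next define $u_1, u_2, \dots, u_k, \dots$ successively according to $u_{k+1} = \mathcal{M}(u_k)$,
$k = 0, 1, 2, \dots$. Then, since $u_0 \in \mathcal{B}$, it follows from (a) that each $u_k \in \mathcal{B}$.
Also by property (b) we have $\|u_{k+1} - u_k\|_{L^q} \le \frac{1}{2}\|u_k - u_{k-1}\|_{L^q}$ and
hence $\|u_{k+1} - u_k\|_{L^q} \le \left(\frac{1}{2}\right)^k \|u_1 - u_0\|_{L^q}$.

Therefore the sequence $\{u_k\}$ converges in the $L^q$ norm to a $u \in \mathcal{B}$, and
hence $u = \mathcal{M}(u) = \sigma S(|u|^{\lambda-1}u) + u_0$, since $u_{k+1} = \mathcal{M}(u_k)$. To see that
$u$ is a distribution solution of (54) we must verify that

(56)    $\int_{\mathbb{R}^d \times \mathbb{R}} u\,\mathcal{L}'(\varphi)\,dx\,dt = \sigma \int_{\mathbb{R}^d \times \mathbb{R}} |u|^{\lambda-1}u\,\varphi\,dx\,dt,$

for every $\varphi$ that is $C^\infty$ and has compact support, with $\mathcal{L}' = -\frac{1}{i}\frac{\partial}{\partial t} - \Delta$.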

However, by Proposition 6.6,

(57)    $\int S(F)\,\mathcal{L}'(\varphi)\,dx\,dt = \int F\,\varphi\,dx\,dt$

if $F$ is a $C^\infty$ function of compact support. We now approximate an arbitrary
$F$ in $L^p(\mathbb{R}^d \times \mathbb{R})$ by a sequence $\{F_n\}$ of $C^\infty$ functions of compact
support. Since Theorem 6.7 implies that $S(F_n) \to S(F)$ in the $L^q$ norm,
the identity (57) for the $F_n$ holds also for $F \in L^p$. Thus we may apply
(57) to $F = \sigma|u|^{\lambda-1}u$ and use Proposition 6.2 part (iii) to conclude
that (56) is valid, because $u = S(F) + u_0$.

Next, applying Proposition 6.8 to $F = \sigma|u|^{\lambda-1}u$ shows that for each
$t$ the function $u(\cdot, t)$ belongs to $L^2(\mathbb{R}^d)$ and $t \mapsto u(\cdot, t)$ is continuous in the
$L^2$ norm. Obviously $u(\cdot, 0) = f(\cdot)$ so the proof that $u$ is a strong solution
is complete.
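
The iteration $u_{k+1} = \mathcal{M}(u_k)$ used above can be tried out on a toy scalar model. The sketch below is only an illustration under stated assumptions: the operator $S$ is replaced by the identity on complex numbers, $\lambda = 3$ and $\sigma = 1$ are sample values, and the names `picard_iterates` and `contraction_radius` are hypothetical. The radius is chosen so that the scalar map $u \mapsto |u|^{\lambda-1}u$ has Lipschitz constant at most $1/2$ on the ball, mirroring property (b).

```python
def contraction_radius(sigma=1.0, lam=3.0):
    """Radius delta with |sigma| * lam * delta**(lam - 1) = 1/2.

    On the ball |u| <= delta the scalar map u -> sigma*|u|**(lam-1)*u
    then has Lipschitz constant at most 1/2 (toy analogue of (b)).
    """
    return (1.0 / (2.0 * lam * abs(sigma))) ** (1.0 / (lam - 1.0))


def picard_iterates(u0, sigma=1.0, lam=3.0, steps=30):
    """Iterate u_{k+1} = M(u_k), with M(u) = sigma*|u|**(lam-1)*u + u0.

    Assumption: the solution operator S of the text is replaced by the
    identity, so only the fixed-point scheme itself is illustrated.
    """
    def M(u):
        return sigma * abs(u) ** (lam - 1.0) * u + u0

    iterates = [u0]
    for _ in range(steps):
        iterates.append(M(iterates[-1]))
    return iterates


if __name__ == "__main__":
    delta = contraction_radius()
    u0 = 0.1 + 0.1j                      # |u0| <= delta / 2
    its = picard_iterates(u0)
    print("delta =", delta)
    print("limit  =", its[-1])
```

With $|u_0| \le \delta/2$ every iterate stays in the ball of radius $\delta$, and successive differences shrink at least geometrically with ratio $1/2$, as in the proof.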

In the second alternative, where we do not assume $\|f\|_{L^2} < \epsilon$, we instead
choose a positive constant $a$ so that

$\left(\int_{\mathbb{R}^d \times \{|t| < a\}} |e^{it\Delta}(f)(x,t)|^q\,dx\,dt\right)^{1/q} \le \delta/2.$

Such a choice of $a$, which depends on $f$, is possible since $e^{it\Delta}f \in L^q(\mathbb{R}^d \times \mathbb{R})$.
We then proceed as in the previous alternative with the understanding
that now $\mathcal{B}$ consists of functions on $\mathbb{R}^d \times \{|t| < a\}$ (of norm $\le \delta$).
Note that $S(F)(\cdot, t)$, for $|t| < a$, depends only on $F(\cdot, s)$ for $|s| < a$, so
all inequalities used are still valid in this context, and the proof can be
carried out as before.

The uniqueness of the solution of (54) and its continuous dependence 
on its initial data is outlined in Exercise 17. 
7 A look back at the Radon transform

We now link the averaging operator studied in Section 4 with the Radon 
transform, pointing out certain striking affinities between these two, and 
formulating a common generalization. 

Some elementary properties of the Radon transform were set down
in Book I, where one can find an indication of its early interest. Of
further significance is its role in the theory of Besicovitch-Kakeya sets.
There, an $L^2$ smoothness property for $d \ge 3$, somewhat akin to that
of averaging operators, is responsible for the continuity of measures of
hyperplane sections asserted in Chapter 7 of Book III. Moreover the
existence of Besicovitch sets may be said to be possible because when
$d = 2$ the smoothing in $L^2$ is exactly of critical order $1/2$; in addition,
this property of the Radon transform allowed one to see that Besicovitch
sets in $d = 2$ must have Hausdorff dimension 2.


7.1 A variant of the Radon transform 

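Recall that in $\mathbb{R}^d$ the Radon transform $\mathcal{R}$ is defined by

$\mathcal{R}(f)(t, \gamma) = \int_{\mathcal{P}_{t,\gamma}} f,$

where $(t, \gamma) \in \mathbb{R} \times S^{d-1}$ and $\mathcal{P}_{t,\gamma}$ is the affine hyperplane $\{x : x \cdot \gamma = t\}$.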

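The smoothing property of $\mathcal{R}$ we have in mind can be stated most
easily when $d = 3$ as the identity

(58)    $\int_{S^2}\int_{\mathbb{R}} \left|\frac{d}{dt}\mathcal{R}(f)(t, \gamma)\right|^2 dt\,d\sigma(\gamma) = 8\pi^2 \int_{\mathbb{R}^3} |f(x)|^2\,dx.$

This is a direct consequence of the observation that $\hat{\mathcal{R}}(f)(\lambda, \gamma) = \hat{f}(\lambda\gamma)$,
with $\hat{\mathcal{R}}(f)(\lambda, \gamma)$ denoting the Fourier transform in $t$ of $\mathcal{R}(f)(t, \gamma)$ (with
dual variable $\lambda$), and $\hat{f}$ denoting the usual 3-dimensional Fourier transform
of $f$.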

To pursue this point a little further we consider briefly a simple "linearized"
variant of the Radon transform that, unlike $\mathcal{R}$, is directly given
as a mapping of functions of $\mathbb{R}^d$ to functions of $\mathbb{R}^d$. This variant is determined
once one fixes a non-degenerate bilinear form $B$ on $\mathbb{R}^{d-1} \times \mathbb{R}^{d-1}$
and is denoted by $\mathcal{R}_B$, with

$\mathcal{R}_B(f)(x) = \int_{\mathbb{R}^{d-1}} f(y', x_d - B(x', y'))\,dy',$

where we have set $x = (x', x_d) \in \mathbb{R}^{d-1} \times \mathbb{R}$ and $y = (y', y_d) \in \mathbb{R}^{d-1} \times \mathbb{R}$.
So $\mathcal{R}_B$ can be written as

$\mathcal{R}_B(f)(x) = \int_{M_x} f,$

with $M_x$ denoting the affine hyperplane $\{(y', y_d) : y_d = x_d - B(x', y')\}$.

The integration measure on $M_x$ is taken to be $dy'$, the Lebesgue measure
on $\mathbb{R}^{d-1}$.

Note that the mapping $x \mapsto M_x$ is an injective mapping from $\mathbb{R}^d$ to the
set of affine hyperplanes on $\mathbb{R}^d$ and this mapping is surjective on the collection
of hyperplanes that are not perpendicular to the hyperplane $M_0$.
Since the excepted collection of hyperplanes is a lower-dimensional subset,
then, broadly speaking, $\mathcal{R}_B$ can be thought of as a substitute for $\mathcal{R}$.

Now let us revert to the simplest case, $d = 3$, where an analog of (58)
is

(59)    $\int_{\mathbb{R}^3} \left|\frac{\partial}{\partial x_3}\mathcal{R}_B(f)(x)\right|^2 dx = c_B \int_{\mathbb{R}^3} |f(x)|^2\,dx,$

which we prove when $f$ is (say) a smooth function with compact support.
To see (59) consider the Fourier transform in the $x_3$-variable (with $\xi_3$
its dual variable), that is, $\hat{\mathcal{R}}_B(f)(x', \xi_3)$ is given by

$\int_{\mathbb{R}^2} e^{-2\pi i \xi_3 B(x', y')}\hat{f}(y', \xi_3)\,dy',$

where here $\hat{f}$ denotes the Fourier transform in the $x_3$-variable. Similarly,
$\left(\frac{\partial}{\partial x_3}\mathcal{R}_B(f)\right)^{\wedge}(x', \xi_3)$ (also taking the Fourier transform in the
$x_3$-variable) is given by

$2\pi i \xi_3 \int_{\mathbb{R}^2} e^{-2\pi i \xi_3 B(x', y')}\hat{f}(y', \xi_3)\,dy'.$


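However, $B(x', y') = C(x') \cdot y'$ for some invertible linear transformation $C$
on $\mathbb{R}^2$. Therefore, introducing the new variable $\xi_3 C(x') = u$, with $u \in \mathbb{R}^2$,
we have $\xi_3 B(x', y') = u \cdot y'$ and $|\xi_3|^2 |\det(C)|\,dx' = du$. So an application
of Plancherel's theorem in $\mathbb{R}^2$ leads to

$\int_{\mathbb{R}^2} \left|\left(\frac{\partial}{\partial x_3}\mathcal{R}_B(f)\right)^{\wedge}(x', \xi_3)\right|^2 dx' = \frac{4\pi^2}{|\det(C)|}\int_{\mathbb{R}^2} |\hat{f}(y', \xi_3)|^2\,dy'.$

Hence, integrating in $\xi_3$ and applying Plancherel's theorem again, but
this time in the $x_3$-variable, yields (59).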
If we consider an appropriately localized version $\mathcal{R}'_B$ of $\mathcal{R}_B$, then using
the above it is easy to see that

$\|\mathcal{R}'_B(f)\|_{L^2_1(\mathbb{R}^3)} \le c\,\|f\|_{L^2(\mathbb{R}^3)}.$

Corresponding results for general $d$, when $d$ is odd, giving $L^2$ smoothing
of order $(d - 1)/2$ can be obtained in the same way. The steps leading
to these conclusions are outlined in Exercises 18 and 19.


7.2 Rotational curvature 


We have learned from the above considerations that there seems to be a
parallel between the averaging operator $A$ and the Radon transform $\mathcal{R}_B$
in terms of their smoothing properties. Each of these operators is of the
form

$f \mapsto \int_{M_x} f(y)\,d\mu_x(y),$

where for each $x \in \mathbb{R}^d$ we have a manifold $M_x$ (that depends smoothly
on $x$) over which we integrate. In the case of $A$ it is $M_x = x + M$, and in
the case of $\mathcal{R}_B$ it is $M_x = \{y = (y', x_d - B(x', y')),\ y' \in \mathbb{R}^{d-1}\}$. However,
paradoxically, the key feature of $A$ was the curvature of $M$, while in the
case of $\mathcal{R}_B$ the corresponding manifolds $M_x$ are hyperplanes and have
no curvature. So how are we to see them as different manifestations
of the same phenomenon? Another issue is the question of having a
diffeomorphic-invariant formulation for the conclusions regarding these
operators. This question arises naturally, because the spaces $L^2$, $L^p$, and
$L^2_k$ are (at least locally) invariant under diffeomorphism.

What unifies the above examples is a common rotational curvature
that takes into account not only the (possible) curvature of each fixed
$M_x$, but how the $M_x$ evolve (or "rotate") as $x$ varies. This concept can
be formulated as follows.

We start with a $C^\infty$ function $\rho = \rho(x, y)$ given on a ball in $\mathbb{R}^d \times \mathbb{R}^d$ (a
"double" defining function), and assign to it its rotational matrix $\mathcal{M}$,
defined as the $(d + 1) \times (d + 1)$ matrix given by

$\mathcal{M} = \begin{pmatrix} \rho & \dfrac{\partial\rho}{\partial y_1} & \cdots & \dfrac{\partial\rho}{\partial y_d} \\ \dfrac{\partial\rho}{\partial x_1} & & & \\ \vdots & & \dfrac{\partial^2\rho}{\partial x_j\,\partial y_k} & \\ \dfrac{\partial\rho}{\partial x_d} & & & \end{pmatrix}.$

We define the rotational curvature of $\rho$, denoted by $\mathrm{rotcurv}(\rho)$, as the
determinant of the matrix $\mathcal{M}$,

$\mathrm{rotcurv}(\rho) = \det(\mathcal{M}).$

Our basic condition is that where $\rho = 0$, then $\mathrm{rotcurv}(\rho) \ne 0$. This
clearly implies $\nabla_y \rho(x, y) \ne 0$ there. Hence if $M_x = \{y : \rho(x, y) = 0\}$, each
$M_x$ is a $C^\infty$ hypersurface in $\mathbb{R}^d$ that in fact depends smoothly on $x$.
We then note the following properties of rotational curvature that are 
straight-forward to verify. 

1. If $\rho(x, y) = \rho(x - y)$, the translation-invariant case, then $M_x = x + M_0$.
Here one also has that the condition $\mathrm{rotcurv}(\rho) \ne 0$ is equivalent
with the non-vanishing of the Gauss curvature of $M_0$.

2. In the case of $\mathcal{R}_B$, we take $\rho(x, y) = y_d - x_d + B(x', y')$, and then
$\mathrm{rotcurv}(\rho) \ne 0$ is equivalent to $B$ being non-degenerate.

3. If $\rho'(x, y) = a(x, y)\rho(x, y)$, with $a(x, y) \ne 0$, then $\rho'$ is another defining
function for $\{M_x\}$ and $\mathrm{rotcurv}(\rho') = a^{d+1}\,\mathrm{rotcurv}(\rho)$ whenever $\rho = 0$.

4. The invariance of rotational curvature under local diffeomorphisms
can be stated this way: Suppose $x \mapsto \Psi_1(x)$ and $y \mapsto \Psi_2(y)$ are a pair of
(local) diffeomorphisms on $\mathbb{R}^d$, and set $\rho'(x, y) = \rho(\Psi_1(x), \Psi_2(y))$. Then
$\mathrm{rotcurv}(\rho') = J_1(x) J_2(y)\,\mathrm{rotcurv}(\rho)$ whenever $\rho'(x, y) = 0$, where $J_1$ and
$J_2$ are the Jacobian determinants of $\Psi_1$ and $\Psi_2$ respectively.
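
Properties 1 and 2 can be checked by hand in the plane. The sketch below, which is only an illustration, writes out the $3 \times 3$ rotational matrix for two sample defining functions with $d = 2$: the linearized Radon case $\rho(x, y) = y_2 - x_2 + b\,x_1 y_1$ (so $B(x', y') = b\,x_1 y_1$), and the translation-invariant circle $\rho(x, y) = |x - y|^2 - 1$. The function names and the sample choices of $\rho$ are assumptions made for this example; the partial derivatives are entered by hand.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def rot_matrix_radon(x, y, b):
    """Rotational matrix for rho(x, y) = y2 - x2 + b*x1*y1 (d = 2)."""
    (x1, x2), (y1, y2) = x, y
    rho = y2 - x2 + b * x1 * y1
    # First row: rho, d rho/dy1, d rho/dy2.
    # First column: rho, d rho/dx1, d rho/dx2.
    # Lower-right block: mixed second derivatives d^2 rho / dx_j dy_k.
    return [[rho, b * x1, 1.0],
            [b * y1, b, 0.0],
            [-1.0, 0.0, 0.0]]


def rot_matrix_circle(x, y):
    """Rotational matrix for rho(x, y) = |x - y|^2 - 1 (unit circles)."""
    u1, u2 = x[0] - y[0], x[1] - y[1]
    rho = u1 * u1 + u2 * u2 - 1.0
    return [[rho, -2.0 * u1, -2.0 * u2],
            [2.0 * u1, -2.0, 0.0],
            [2.0 * u2, 0.0, -2.0]]


if __name__ == "__main__":
    print(det3(rot_matrix_radon((0.3, -1.2), (2.0, 0.7), 2.5)))
    print(det3(rot_matrix_circle((0.0, 0.0), (0.6, 0.8))))
```

In the Radon case the determinant equals $b$ at every point, so it is non-zero exactly when the form is non-degenerate; for the circle it equals $4\rho - 8|x - y|^2$, which is $-8$ on $\rho = 0$.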

With these notions in hand we can come to the regularity theorem for 
the general form of the Radon transform. 

We assume we are given a double defining function $\rho$ as above with
$\mathrm{rotcurv}(\rho) \ne 0$. We set $M_x = \{y : \rho(x, y) = 0\}$. For each $x$ we let
$d\sigma_x(y)$ be the induced Lebesgue measure on $M_x$, and define $d\mu_x(y) =
\psi_0(x, y)\,d\sigma_x(y)$, where $\psi_0$ is some fixed $C^\infty$ function on $\mathbb{R}^d \times \mathbb{R}^d$ of compact
support. Given this, we define the general averaging operator $\mathcal{A}$
by

(60)    $\mathcal{A}(f)(x) = \int_{M_x} f(y)\,d\mu_x(y),$

initially for functions $f$ on $\mathbb{R}^d$ that are (say) continuous with compact
support.

Theorem 7.1 The operator $\mathcal{A}$ extends to a bounded linear map of $L^2(\mathbb{R}^d)$
to $L^2_k(\mathbb{R}^d)$, with $k = \frac{d-1}{2}$.
It should be pointed out that the averaging operator $A$ of Section 4 is
translation-invariant, and the Radon transform $\mathcal{R}_B$ is partially so; it is
translation-invariant with respect to the $x_3$-variable. So in both cases
the Fourier transform can be used. However in the general situation the
Fourier transform is unavailable and we must proceed differently.

There will be two steps. The first will use an oscillatory integral operator
that partly substitutes for the Fourier transform and Plancherel's
theorem. The second is an $L^2$ estimate, obtained via a dyadic decomposition
of "almost-orthogonal" parts, that further serves to implement
this approach.


7.3 Oscillatory integrals 

We turn to the first idea. We consider an operator $T_\lambda$ (depending on a
positive parameter $\lambda$) of the form

$T_\lambda(f)(x) = \int_{\mathbb{R}^d} e^{i\lambda\Phi(x, y)}\psi(x, y) f(y)\,dy.$

Here $\Phi$ and $\psi$ are a pair of $C^\infty$ functions on $\mathbb{R}^d \times \mathbb{R}^d$, the latter assumed
to have compact support. The phase $\Phi$ is supposed to be real-valued,
and the key assumption is that its mixed Hessian

(61)    $\det\{\nabla^2_{x,y}\Phi\} = \det\left\{\frac{\partial^2\Phi}{\partial x_k\,\partial y_j}\right\}_{1 \le k, j \le d}$

is non-vanishing on the support of $\psi$.

Proposition 7.2 Under the above assumptions we have $\|T_\lambda\| \le c\lambda^{-d/2}$,
$\lambda > 0$, with $\|\cdot\|$ denoting the norm of the operator acting on $L^2(\mathbb{R}^d)$.

For us the importance of this proposition is the consequence it has for
a corresponding oscillatory integral that involves the defining function $\rho$.
We set

(62)    $S_\lambda(f)(x) = \int_{\mathbb{R} \times \mathbb{R}^d} e^{i\lambda y_0 \rho(x, y)}\psi(x, y_0, y) f(y)\,dy_0\,dy.$

Here the integration is over $(y_0, y) \in \mathbb{R} \times \mathbb{R}^d$. The function $\psi$ is again a
$C^\infty$ function with compact support in all variables, but the noteworthy
further assumption is that $\psi$ is supported away from $y_0 = 0$.

Corollary 7.3 Assume that the double defining function $\rho$ satisfies the
condition $\mathrm{rotcurv}(\rho) \ne 0$ on the set where $\rho = 0$. Then

$\|S_\lambda\| \le c\lambda^{-\frac{d+1}{2}}.$

Note. We have an extra gain of $\lambda^{-1/2}$ over what can be said for $T_\lambda$.

The proof of the proposition is in many ways like that of the scalar
version, Proposition 2.5 in Section 2, so we will be brief. As before, we
begin by taking the precaution that $\psi$ is supported in a small ball. Now
if $T$ is an operator on $L^2$, then $\|T^*T\| = \|T\|^2$, where $T^*$ denotes the
adjoint of $T$.${}^{9}$

However $T_\lambda$ is given by the kernel $K(x, y) = e^{i\lambda\Phi(x, y)}\psi(x, y)$, that is,
$T_\lambda(f)(x) = \int K(x, y) f(y)\,dy$, so $T_\lambda^*$ is given by the kernel $\overline{K}(y, x)$ and
$T_\lambda^* T_\lambda$ is given by the kernel

$M(x, y) = \int_{\mathbb{R}^d} \overline{K}(z, x) K(z, y)\,dz = \int_{\mathbb{R}^d} e^{i\lambda[\Phi(z, y) - \Phi(z, x)]}\Psi(x, y, z)\,dz,$

with $\Psi(x, y, z) = \overline{\psi}(z, x)\psi(z, y)$. The crucial point will be like (14),
namely,

$|M(x, y)| \le c_N(\lambda|x - y|)^{-N}$ for every $N \ge 0$.

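Here, with $z = (z_1, \dots, z_d) \in \mathbb{R}^d$, we use the vector field

$L = \frac{1}{i\lambda}\sum_{j=1}^{d} a_j \frac{\partial}{\partial z_j} = \frac{1}{i\lambda}\,a \cdot \nabla_z$

and its transpose, $L^t(f) = -\frac{1}{i\lambda}\sum_{j=1}^{d} \frac{\partial(a_j f)}{\partial z_j}$, where

$(a_j) = a = \frac{\nabla_z(\Phi(z, y) - \Phi(z, x))}{|\nabla_z(\Phi(z, y) - \Phi(z, x))|^2},$

so that $L\big(e^{i\lambda[\Phi(z, y) - \Phi(z, x)]}\big) = e^{i\lambda[\Phi(z, y) - \Phi(z, x)]}$.

Now because $u = x - y$ is sufficiently small in view of the support assumptions
made on $\psi$, we see as before that $|a| \approx |x - y|^{-1}$ and $|\partial_z^\alpha a| \lesssim
|x - y|^{-1}$, for all $\alpha$. Thus

$|M(x, y)| = \left|\int L^N\big(e^{i\lambda[\Phi(z, y) - \Phi(z, x)]}\big)\Psi(x, y, z)\,dz\right| \le \int |(L^t)^N \Psi(x, y, z)|\,dz \le c_N(\lambda|x - y|)^{-N}.$

9 In this connection, see for instance Exercise 19 in Chapter 4 of Book III.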
However, then

$|T_\lambda^* T_\lambda f(x)| \le \int |M(x, y)||f(y)|\,dy \le \int M^0(x - y)|f(y)|\,dy = \int M^0(y)|f(x - y)|\,dy,$

where $M^0(u) = c'_N(1 + \lambda|u|)^{-N}$, and by Minkowski's inequality

$\|T_\lambda^* T_\lambda(f)\|_{L^2} \le \|f\|_{L^2} \int M^0(u)\,du.$

However $\int M^0(u)\,du = c\lambda^{-d}$, if in the estimate for $M^0$ the $N$ is taken
to be greater than $d$. As a result $\|T_\lambda^* T_\lambda\| \le c\lambda^{-d}$, and the proposition is
proved.


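We turn now to the corollary. The link between the rotational curvature
of $\rho$ and the phase $\Phi$ in the proposition occurs in passing from $\mathbb{R}^d$
to $\mathbb{R}^{d+1}$. With $\bar{x} = (x_0, x) \in \mathbb{R} \times \mathbb{R}^d = \mathbb{R}^{d+1}$ and $\bar{y} = (y_0, y) \in \mathbb{R} \times \mathbb{R}^d =
\mathbb{R}^{d+1}$, we set

$\Phi(\bar{x}, \bar{y}) = x_0 y_0 \rho(x, y).$

Then, factoring $x_0$ out of each of the $d$ rows indexed by $x_1, \dots, x_d$ and $y_0$
out of each of the $d$ columns indexed by $y_1, \dots, y_d$ of the mixed Hessian,

$\det(\nabla^2_{\bar{x}, \bar{y}}\Phi) = (x_0 y_0)^{d}\,\mathrm{rotcurv}(\rho).$

Now define $F_\lambda(x_0, x)$ by

(63)    $F_\lambda(x_0, x) = F_\lambda(\bar{x}) = \int_{\mathbb{R}^{d+1}} e^{i\lambda\Phi(\bar{x}, \bar{y})}\psi_1(x_0, x, y_0, y) f(y)\,dy_0\,dy = \int_{\mathbb{R}^{d+1}} e^{i\lambda x_0 y_0 \rho(x, y)}\psi_1(x_0, x, y_0, y) f(y)\,dy_0\,dy,$

with $\psi_1(1, x, y_0, y) = \psi(x, y_0, y)$, and $\psi_1$ having compact support that is
disjoint from $x_0 = 0$ or $y_0 = 0$.

This means that $S_\lambda(f)(x) = F_\lambda(1, x)$.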

To proceed we need the following little calculus lemma, valid for any
function $g$ which is of class $C^1$ in an interval $I$ of length one. Suppose
$u_0 \in I$, then

(64)    $|g(u_0)|^2 \le 2\left(\int_I |g(u)|^2\,du + \int_I |g'(u)|^2\,du\right).$

Indeed, for any $u \in I$, one has $g(u_0) = g(u) + \int_u^{u_0} g'(r)\,dr$. So by Schwarz's
inequality

$|g(u_0)|^2 \le 2\left(|g(u)|^2 + \int_I |g'(r)|^2\,dr\right),$

and an integration in $u$ ranging over $I$ then yields (64).
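
As a sanity check, inequality (64) can be tested numerically for a given $g$. The sketch below is an illustration only: the helper name `lemma_sides`, the midpoint-rule quadrature, and the sample function $g(u) = \sin(5u)$ are assumptions made for this example.

```python
import math


def lemma_sides(g, dg, u0, left=1.0, n=20000):
    """Return both sides of (64) on the interval I = [left, left + 1].

    The first value is |g(u0)|^2; the second is
    2 * (int_I |g|^2 du + int_I |g'|^2 du), with both integrals
    approximated by the midpoint rule on n cells.
    """
    h = 1.0 / n
    mids = [left + (i + 0.5) * h for i in range(n)]
    int_g2 = h * sum(abs(g(u)) ** 2 for u in mids)
    int_dg2 = h * sum(abs(dg(u)) ** 2 for u in mids)
    return abs(g(u0)) ** 2, 2.0 * (int_g2 + int_dg2)


if __name__ == "__main__":
    lhs, rhs = lemma_sides(lambda u: math.sin(5 * u),
                           lambda u: 5 * math.cos(5 * u), 1.0)
    print(lhs, "<=", rhs)
```

For a rapidly oscillating $g$ the right-hand side is dominated by the derivative term, which is why the extra factor of $\lambda$ produced by differentiating in $x_0$ has to be dealt with separately in the argument that follows.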

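We apply this inequality with $I = [1, 2]$, $u_0 = 1$ and $g(u) = F_\lambda(u, x)$
(that is, $u$ is the variable $x_0$). Since $F_\lambda(1, x) = S_\lambda(f)(x)$, we therefore
have after an integration in $x \in \mathbb{R}^d$

$\int_{\mathbb{R}^d} |S_\lambda(f)(x)|^2\,dx \le 2\left(\int_{\mathbb{R} \times \mathbb{R}^d} |F_\lambda(x_0, x)|^2\,dx_0\,dx + \int_{\mathbb{R} \times \mathbb{R}^d} \left|\frac{\partial}{\partial x_0}F_\lambda(x_0, x)\right|^2 dx_0\,dx\right).$

The first term on the right-hand side of the inequality is dominated by a
multiple of $\lambda^{-(d+1)}\int_{\mathbb{R}^d} |f(y)|^2\,dy$, as we see by applying the proposition
(with $\mathbb{R}^{d+1}$ in place of $\mathbb{R}^d$), since $\psi_1$ has compact support in $y_0$.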

However the second term is more problematic, because differentiation
in $x_0$ in (63) brings down an extra factor of $\lambda$. We get around this by
observing that

$\frac{\partial}{\partial x_0}\left(e^{i\lambda x_0 y_0 \rho(x, y)}\right) = \frac{y_0}{x_0}\frac{\partial}{\partial y_0}\left(e^{i\lambda x_0 y_0 \rho(x, y)}\right),$

and then integrating by parts in the $y_0$ variable in (63). We note that
because of the support property of $\psi_1$, the variable $x_0$ is bounded away
from 0, and the differentiation in $y_0$ falls only on the smooth functions
of the integrand, and not on $f(y)$ since it is independent of $y_0$.

This shows that the second term also satisfies the desired estimate, 
establishing the corollary. 


7.4 Dyadic decomposition 


We now come to the dyadic decomposition of the operator $\mathcal{A}$. When we
fix any Schwartz function $h$ on $\mathbb{R}$ that is normalized by $\int_{\mathbb{R}} h(\rho)\,d\rho = 1$,
then we know (see Exercise 8) that for any smooth hypersurface $M$ in $\mathbb{R}^d$
with defining function $\rho$, and any continuous function $f$ on $\mathbb{R}^d$ of compact
support,

$\lim_{\epsilon \to 0} \epsilon^{-1}\int_{\mathbb{R}^d} h(\rho(x)/\epsilon) f(x)\,dx = \int_M f\,\frac{d\sigma}{|\nabla\rho|},$

with $d\sigma$ the induced Lebesgue measure on $M$.
As a result (see (60))

$\mathcal{A}(f)(x) = \int_{M_x} f(y)\,d\mu_x(y) = \lim_{\epsilon \to 0} \epsilon^{-1}\int_{\mathbb{R}^d} h\!\left(\frac{\rho(x, y)}{\epsilon}\right)\psi(x, y) f(y)\,dy,$

where $\psi$ is a $C^\infty$ function of compact support given by $\psi(x, y) =
\psi_0(x, y)|\nabla_y \rho|$ and $d\mu_x(y) = \psi_0(x, y)\,d\sigma_x(y)$.

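Now choose $\gamma(u)$ to be a $C^\infty$ function on $\mathbb{R}$ with $\gamma$ supported in $|u| \le 1$,
and $\gamma(u) = 1$ if $|u| \le 1/2$, and let $h(\rho) = \int_{\mathbb{R}} e^{2\pi i u\rho}\gamma(u)\,du$. Then by
the Fourier inversion theorem $\int_{\mathbb{R}} h(\rho)\,d\rho = \gamma(0) = 1$, and also $\int e^{2\pi i u\rho}\gamma(\epsilon u)\,du =
\epsilon^{-1}h(\rho/\epsilon)$.

Next write $\epsilon = 2^{-r}$, with $r$ a positive integer, and $\gamma(2^{-r}u) = \gamma(u) +
\sum_{k=1}^{r}\big(\gamma(2^{-k}u) - \gamma(2^{-k+1}u)\big)$. Letting $r \to \infty$ we have

$1 = \gamma(u) + \sum_{k=1}^{\infty} \eta(2^{-k}u),$

with $\eta(u) = \gamma(u) - \gamma(2u)$, and $\eta$ is supported in $1/4 \le |u| \le 1$.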
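
The telescoping identity can be verified numerically once a concrete cutoff is fixed. The sketch below is an illustration under assumptions: the particular smooth cutoff built from $e^{-1/s}$ is one standard choice and is not taken from the text, and the names `smooth_step`, `gamma`, `eta` and `partial_sum` are hypothetical.

```python
import math


def smooth_step(t):
    """C-infinity step: 0 for t <= 0, 1 for t >= 1, increasing in between."""
    def f(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return f(t) / (f(t) + f(1.0 - t))


def gamma(u):
    """Cutoff equal to 1 on |u| <= 1/2 and supported in |u| <= 1."""
    return 1.0 - smooth_step(2.0 * abs(u) - 1.0)


def eta(u):
    """eta(u) = gamma(u) - gamma(2u), so eta(2^-k u) telescopes."""
    return gamma(u) - gamma(2.0 * u)


def partial_sum(u, r):
    """gamma(u) + sum_{k=1}^r eta(2^-k u), which equals gamma(2^-r u)."""
    return gamma(u) + sum(eta(u / 2.0 ** k) for k in range(1, r + 1))


if __name__ == "__main__":
    for u in (0.0, 0.3, 0.8, 7.0, 1234.5):
        print(u, partial_sum(u, 20))
```

For any fixed $u$ the partial sums are eventually equal to 1, and each $\eta(2^{-k}u)$ vanishes unless $2^{k-2} \le |u| \le 2^{k}$, which is what localizes the $k$-th piece to frequencies of size about $2^k$.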

As a result of the above, we can write, whenever $f$ is continuous,

$\mathcal{A}(f)(x) = \sum_{k=0}^{\infty} \mathcal{A}_k(f)(x) = \lim_{r \to \infty} \sum_{k=0}^{r} \mathcal{A}_k(f)(x),$

where

(65)    $\mathcal{A}_k(f)(x) = \int_{\mathbb{R} \times \mathbb{R}^d} e^{2\pi i u \rho(x, y)}\eta(2^{-k}u)\psi(x, y) f(y)\,du\,dy$

(with a similar formula for $\mathcal{A}_0$, but with $\eta(2^{-k}u)$ replaced by $\gamma(u)$). The
limit here exists for each $x$.

We now make the following observations about the operators $\mathcal{A}_k(f)$,
the first of which is self-evident.

(a) $\mathcal{A}_k(f)$ is a $C^\infty$ function of compact support for each $f \in L^2(\mathbb{R}^d)$.

(b) We have the estimates

(66)    $\|\mathcal{A}_k(f)\|_{L^2} \le c\,2^{-k\left(\frac{d-1}{2}\right)}\|f\|_{L^2}.$

In fact, if we make the change of variables $2^{-k}u = y_0$, then

(67)    $\mathcal{A}_k(f)(x) = 2^k \int_{\mathbb{R} \times \mathbb{R}^d} e^{2\pi i 2^k y_0 \rho(x, y)}\psi(x, y_0, y) f(y)\,dy_0\,dy,$

with $\psi(x, y_0, y) = \psi(x, y)\eta(y_0)$, which, in light of (62), equals

$2^k S_\lambda(f)(x),$

where $\lambda = 2\pi 2^k$. Thus the inequality (66) is an immediate consequence
of Corollary 7.3, since $\eta$ is supported away from zero: indeed
$2^k \lambda^{-\frac{d+1}{2}} \le c\,2^{k}2^{-k\frac{d+1}{2}} = c\,2^{-k\frac{d-1}{2}}$.

(c) We have the following strong "almost-orthogonality" of the collection
$\{\mathcal{A}_k\}$: there is an integer $m > 0$ so that whenever $|k - j| \ge m$,

(68)    $\|\mathcal{A}_k \mathcal{A}_j^*(f)\|_{L^2} \le c_N 2^{-N\max(k, j)}\|f\|_{L^2},$

for each $N \ge 0$. A similar assertion holds for $\mathcal{A}_k^* \mathcal{A}_j$.

To verify (c) we make a simple estimate of the size of the kernel of the
operator $\mathcal{A}_k \mathcal{A}_j^*$. A straightforward calculation yields that its kernel is
given by

(69)    $K(x, y) = 2^k 2^j \int_{\mathbb{R} \times \mathbb{R} \times \mathbb{R}^d} e^{2\pi i(2^j v\rho(z, y) - 2^k u\rho(z, x))}\Psi(z, x, y)\eta(u)\eta(v)\,dz\,du\,dv,$

where $\Psi(z, x, y) = \psi(z, x)\psi(z, y)$. Now assume $j \ge k$ (the case $k \ge j$ is
similar). Write the exponent in (69) as

$2\pi i(2^j v\rho(z, y) - 2^k u\rho(z, x)) = i\lambda\Phi(z),$

with $\lambda = 2\pi 2^j$ and $\Phi(z) = v\rho(z, y) - 2^{k-j}u\rho(z, x)$. Recall that because
of the support properties of $\eta$ we have $1/4 \le |v| \le 1$ and $1/4 \le |u| \le 1$.
As a result $|\nabla_z \Phi(z)| \ge c' > 0$ if $j - k \ge m$, for some fixed $m$ that is large
enough (because $|\nabla_z \rho(z, y)| \ge c$, while $|\nabla_z \rho(z, x)| \le 1/c$ for a constant
$c$ that is small enough).

We now can invoke Proposition 2.1 to estimate $\int_{\mathbb{R}^d} e^{i\lambda\Phi(z)}\Psi(z, x, y)\,dz$,
and as a result obtain that for each $N \ge 0$

$|K(x, y)| \le c_N 2^k 2^j 2^{-jN} \le c_N 2^{-\max(k, j)N'}$, with $N' = N - 2$.

Since $K$ also has fixed compact support, the estimate (68) for $\mathcal{A}_k \mathcal{A}_j^*$ is
therefore established. Of course a parallel argument works for $\mathcal{A}_k^* \mathcal{A}_j$,
and property (c) is proved.

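(d) Our last assertion concerns the operators $\left(\frac{\partial}{\partial x}\right)^{\alpha}\mathcal{A}_k = \partial_x^\alpha \mathcal{A}_k$, which
we denote by $\mathcal{A}_k^{(\alpha)}$. Note that like $\mathcal{A}_k$, $\mathcal{A}_k^{(\alpha)}$ has a kernel that is $C^\infty$ and
has compact support. The $\{\mathcal{A}_k^{(\alpha)}\}$ satisfy estimates very similar to those
for $\{\mathcal{A}_k\}$. In fact,

(70)    $\|\mathcal{A}_k^{(\alpha)}\| \le c_\alpha 2^{k|\alpha|}2^{-k\left(\frac{d-1}{2}\right)},$

and

(71)    $\|\mathcal{A}_k^{(\alpha)}(\mathcal{A}_j^{(\alpha)})^*\| \le c_{\alpha, N}2^{-N\max(k, j)}$, if $|k - j| \ge m$.

There is a parallel estimate for $(\mathcal{A}_k^{(\alpha)})^*\mathcal{A}_j^{(\alpha)}$. Here of course $\|\cdot\|$ denotes
the operator norm on $L^2(\mathbb{R}^d)$.

Looking at (67) we see that carrying out the differentiation $\partial_x^\alpha$ on
$\mathcal{A}_k(f)$ yields a finite sum of terms like $\mathcal{A}_k$ (but with modified $\psi$'s) multiplied
by factors that do not exceed $2^{k|\alpha|}$. Thus (70) and (71) are direct
consequences of assertions (b) and (c) above.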

7.5 Almost-orthogonal sums 

Since we have appropriate control of the norms of the different pieces $\mathcal{A}_k$
making up $\mathcal{A}$, we now put these together by using a general
almost-orthogonality principle.

We consider a sequence $\{T_k\}$ of bounded operators on $L^2(\mathbb{R}^d)$ and we
assume we are given positive constants $a(k)$, with $-\infty < k < \infty$, so that
the sum is finite, that is, $A = \sum_{k=-\infty}^{\infty} a(k) < \infty$.

Proposition 7.4 Assume that

$\|T_k T_j^*\| \le a^2(k - j)$ and $\|T_k^* T_j\| \le a^2(k - j).$

Then for every $r$,

(72)    $\left\|\sum_{k=0}^{r} T_k\right\| \le A.$

The thrust of this proposition is of course that the bound $A$ is independent
of $r$.

Proof. We write $T = \sum_{k=0}^{r} T_k$ and recall that $\|T\|^2 = \|TT^*\|$. Since
$TT^*$ is self-adjoint we may use this identity repeatedly to obtain $\|T\|^{2n} =
\|(TT^*)^n\|$ (at least when $n$ is of the form $n = 2^s$ for some integer $s$). Now

$(TT^*)^n = \sum_{i_1, i_2, \dots, i_{2n}} T_{i_1}T_{i_2}^* \cdots T_{i_{2n-1}}T_{i_{2n}}^*.$

We make two estimates for the norm of each term in the above sum.
First

$\|T_{i_1}T_{i_2}^* \cdots T_{i_{2n-1}}T_{i_{2n}}^*\| \le a^2(i_1 - i_2)\,a^2(i_3 - i_4) \cdots a^2(i_{2n-1} - i_{2n}),$

which is obtained by associating the product as $(T_{i_1}T_{i_2}^*) \cdots (T_{i_{2n-1}}T_{i_{2n}}^*)$.
Next

$\|T_{i_1}T_{i_2}^* \cdots T_{i_{2n-1}}T_{i_{2n}}^*\| \le A^2 a^2(i_2 - i_3)\,a^2(i_4 - i_5) \cdots a^2(i_{2n-2} - i_{2n-1}),$

which is obtained by associating the product as $T_{i_1}(T_{i_2}^*T_{i_3}) \cdots
(T_{i_{2n-2}}^*T_{i_{2n-1}})T_{i_{2n}}^*$, and using the fact that $T_{i_1}$ and $T_{i_{2n}}^*$ are both bounded
by $A$. Taking the geometric mean of these estimates yields

$\|T_{i_1}T_{i_2}^* \cdots T_{i_{2n-1}}T_{i_{2n}}^*\| \le A\,a(i_1 - i_2)\,a(i_2 - i_3) \cdots a(i_{2n-1} - i_{2n}).$

Now we sum this first in $i_1$, then $i_2$, and so on, until $i_{2n-1}$, obtaining
a further factor of $A$ each time, because $A = \sum a(k)$. When we sum
in $i_{2n}$ we use the fact that there are $r + 1$ terms in the sum. The result
is then $\|T\|^{2n} \le A^{2n}(r + 1)$. Taking the $(2n)$-th root and letting $n \to \infty$
gives (72) and the proposition.
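
The content of Proposition 7.4 is that the bound does not grow with $r$, unlike the triangle inequality. The sketch below illustrates this on a deliberately simple toy family, which is an assumption of the example and not taken from the text: $T_k$ is the diagonal operator on a finite sequence space with entries $2^{-|i-k|}$, so every norm reduces to a maximum over entries. The name `cotlar_stein_demo` is hypothetical.

```python
def weights(k, size):
    """Diagonal entries of the toy operator T_k: 2^(-|i - k|)."""
    return [2.0 ** (-abs(i - k)) for i in range(size)]


def cotlar_stein_demo(r):
    """Return (norm of T_0 + ... + T_r, the constant A) for the toy family.

    The T_k are real diagonal, hence self-adjoint and commuting, so
    ||T_k T_j^*|| = ||T_k^* T_j|| = max_i |w_k(i) w_j(i)| and the norm
    of the sum is the largest diagonal entry of the sum.
    """
    size = r + 1
    T = [weights(k, size) for k in range(r + 1)]

    def a(m):
        # Smallest admissible a(m): sqrt of sup ||T_k T_j^*|| over k - j = m.
        best = 0.0
        for k in range(r + 1):
            j = k - m
            if 0 <= j <= r:
                best = max(best, max(x * y for x, y in zip(T[k], T[j])))
        return best ** 0.5

    A = sum(a(m) for m in range(-r, r + 1))
    norm_sum = max(abs(sum(T[k][i] for k in range(r + 1))) for i in range(size))
    return norm_sum, A


if __name__ == "__main__":
    for r in (5, 50):
        print(r, cotlar_stein_demo(r))
```

Here each $\|T_k\| = 1$, so the triangle inequality only gives the bound $r + 1$, while $A = \sum_m 2^{-|m|/2}$ stays below 6 for every $r$.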


7.6 Proof of Theorem 7.1 

We consider first the case when the dimension d is odd, and thus the 
fraction (d — 1)/2 is integral. The case when d is even is slightly more 
complicated, and will be dealt with separately. 

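In this first case we must show that whenever $|\alpha| \le (d - 1)/2$, and
$f \in L^2(\mathbb{R}^d)$, the derivative $\partial_x^\alpha \mathcal{A}(f)$ exists in the sense of distributions, is
an $L^2$ function, and the mapping $f \mapsto \partial_x^\alpha \mathcal{A}(f)$ is bounded on $L^2$.

For each $r$ we consider

$\partial_x^\alpha \sum_{k=0}^{r} \mathcal{A}_k = \sum_{k=0}^{r} T_k$, where $T_k = \mathcal{A}_k^{(\alpha)} = \partial_x^\alpha \mathcal{A}_k$.

Now because of (70) and (71) we see that the hypotheses in Proposition 7.4
are satisfied with in fact $a(k) = c_N 2^{-|k|N}$ (and in particular for
$N = 1$). Thus

(73)    $\left\|\sum_{k=0}^{r} \partial_x^\alpha \mathcal{A}_k(f)\right\|_{L^2} \le A\|f\|_{L^2}, \quad |\alpha| \le \frac{d-1}{2}.$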
However, by (70) for $\alpha = 0$, the sum $\sum_{k=0}^{r} \mathcal{A}_k(f)$ converges in the $L^2$ norm
as $r \to \infty$ to $\mathcal{A}(f)$ (since the sum also converges pointwise to $\mathcal{A}(f)$),
and hence in the sense of distributions. Thus $\sum_{k=0}^{r} \partial_x^\alpha \mathcal{A}_k(f)$ also converges
in the sense of distributions as $r \to \infty$, but since this sum is uniformly
in $L^2$ as $r$ varies, the limit is also in $L^2$.

Finally then we have

$\|\partial_x^\alpha \mathcal{A}(f)\|_{L^2} \le A\|f\|_{L^2},$

whenever $f$ is continuous and of compact support and $|\alpha| \le (d - 1)/2$,
with $d$ odd. Hence Theorem 7.1 is proved in this case.

Now we consider the case when $d$ is even. Here we need to involve the
"fractional derivative" operator $D^s$, defined on the Schwartz space $\mathcal{S}$ by
its action as a multiplier on the Fourier transform, namely

$(D^s f)^{\wedge}(\xi) = (1 + |\xi|^2)^{s/2}\hat{f}(\xi).$

Note that $\|D^s(f)\|_{L^2} = \|f\|_{L^2_\sigma}$, where $\sigma = \mathrm{Re}(s)$, whenever $f$ is in $\mathcal{S}$. We
also need to observe that if $\mathrm{Re}(s) = m$ is a positive integer, then

(74)    $\|D^s(f)\|_{L^2} \le c \sum_{|\alpha| \le m} \|\partial_x^\alpha f\|_{L^2}.$

Indeed, this follows directly from the inequality $(1 + |\xi|^2)^{m/2} \le
c' \sum_{|\alpha| \le m} |\xi^\alpha|$, $\xi \in \mathbb{R}^d$, and Plancherel's theorem.

Now arguing as above for the case when $d$ is odd, it will suffice to prove
that

(75)    $\left\|\sum_{k=0}^{r} D^{\frac{d-1}{2}}\mathcal{A}_k(f)\right\|_{L^2} \le c\|f\|_{L^2},$

with the bound $c$ independent of $r$. To this end, consider the family of
operators $T^s$, depending on the complex parameter $s$, defined by

(76)    $T^s(f) = D^{s + \frac{d-1}{2}}\sum_{k=0}^{r} 2^{-ks}\mathcal{A}_k(f),$

for $f \in L^2(\mathbb{R}^d)$ (in particular for simple $f$). As we have already noted, for
such $f$ the $\mathcal{A}_k(f)$ are in $\mathcal{S}$ so (76) is well-defined and $T^s(f)$ is itself in $\mathcal{S}$.
Moreover, whenever $g \in L^2$ (in particular, if it is a simple function) then
by Plancherel's theorem

$\Phi(s) = \int_{\mathbb{R}^d} T^s(f)\,g\,dx = \sum_{k=0}^{r} 2^{-ks}\int_{\mathbb{R}^d} (1 + |\xi|^2)^{s/2}\hat{F}_k(\xi)\hat{g}(-\xi)\,d\xi,$

where each $F_k = D^{\frac{d-1}{2}}\mathcal{A}_k(f)$ belongs to $\mathcal{S}$. Hence $\Phi$ is analytic (in fact entire) in $s$,
and by Schwarz's inequality, bounded in any strip $a \le \mathrm{Re}(s) \le b$.

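Next

(77)    $\sup_t \|T^{-\frac{1}{2} + it}(f)\|_{L^2} \le M\|f\|_{L^2}.$

In fact, by (74) and (76) it suffices to see that

$\left\|\sum_{k=0}^{r} 2^{\frac{k}{2}}2^{-ikt}\,\partial_x^\alpha \mathcal{A}_k(f)\right\|_{L^2} \le M\|f\|_{L^2}$, for $|\alpha| \le \frac{d-2}{2}$.

But this is proved like (73) by using estimates (70) and (71) for $\mathcal{A}_k^{(\alpha)} =
\left(\frac{\partial}{\partial x}\right)^{\alpha}\mathcal{A}_k$, together with the almost-orthogonality proposition in
Section 7.5.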

Similarly, one shows that

(78)    $\sup_t \|T^{\frac{1}{2} + it}(f)\|_{L^2} \le M\|f\|_{L^2}.$

Finally, we apply the analytic interpolation given by Proposition 4.4.
Here the strip is $a \le \mathrm{Re}(s) \le b$, with $a = -1/2$, $b = 1/2$ and $c = 0$, while
$p_0 = q_0 = p_1 = q_1 = 2$. The result is then

$\|T^0(f)\|_{L^2} \le M\|f\|_{L^2},$

which in view of the definition (76) is the estimate (75). This completes
the proof of the theorem.

Remark. The $L^p$, $L^q$ boundedness result of Theorem 4.1 (b) and Corollary 4.2
extends to this setting. The proof is outlined in Exercise 20.

8 Counting lattice points 

In this last section we will see the relevance of oscillatory integrals to 
some questions related to number theory. 
8.1 Averages of arithmetic functions

The arithmetic functions we have in mind are $r_2(k)$, the number of
representations of $k$ as the sum of two squares, and $d(k)$, the number of
divisors of $k$. Even a cursory examination of the size of these functions
as $k \to \infty$ reveals a high degree of irregularity, so that it is not possible
to capture by simple analytic expressions the essential behavior of these
functions for large $k$.

In fact, it is an elementary observation that $r_2(k) = 0$ and $d(k) =
2$, each for infinitely many $k$, while given any $A > 0$, one has $r_2(k) \ge
(\log k)^A$ for infinitely many $k$, and the same is true for $d(k)$.${}^{10}$

In this context an inspired idea was to inquire instead as to the average
behavior of these arithmetic functions. That this might be a fruitful
question is already indicated by the observation of Gauss: the average
value of $r_2(k)$ is $\pi$. This means that $\frac{1}{\mu}\sum_{k=1}^{\mu} r_2(k) \to \pi$, as $\mu \to \infty$.

In more detail, we have the following result.

Proposition 8.1 $\sum_{k=1}^{\mu} r_2(k) = \pi\mu + O(\mu^{1/2})$ as $\mu \to \infty$.

The proof depends on the realization that $\sum_{k=0}^{\mu} r_2(k)$ represents the
number of lattice points in the disc of radius $R$ with $R^2 = \mu$. In fact,
with $\mathbb{Z}^2$ denoting the lattice points in $\mathbb{R}^2$, that is, the points in $\mathbb{R}^2$ with
integral coordinates, then $r_2(k) = \#\{(n_1, n_2) \in \mathbb{Z}^2 : k = n_1^2 + n_2^2\}$, and
hence

$\sum_{k=0}^{\mu} r_2(k) = \#\{(n_1, n_2) \in \mathbb{Z}^2 : n_1^2 + n_2^2 \le R^2\}.$

So if $N(R)$ is the quantity above, then the proposition is equivalent to

(79)    $N(R) = \pi R^2 + O(R)$, as $R \to \infty$.

To prove this we write $D_R$ for the closed disc $\{x \in \mathbb{R}^2 : |x| \le R\}$, and let
$\tilde{D}_R$ be the rectangular region that is the union of unit squares centered
at points $n \in \mathbb{Z}^2$ with $n \in D_R$, that is,

$\tilde{D}_R = \bigcup_{|n| \le R,\ n \in \mathbb{Z}^2} (S + n),$

with $S = \{x = (x_1, x_2) : -1/2 \le x_i < 1/2,\ i = 1, 2\}$.

10 For the elementary facts about $r_2(k)$ and $d(k)$ stated here, including the asymptotic
formula (81), see, for example, Chapter 8 in Book I and Chapter 10 in Book II.


Since the squares $S + n$ are mutually disjoint and each has area 1, we
see that $m(\tilde{D}_R) = N(R)$. However

(80)    $D_{R - 2^{-1/2}} \subset \tilde{D}_R \subset D_{R + 2^{-1/2}}.$

In fact if $x \in S + n$ with $|n| \le R$, then $|x| \le 2^{-1/2} + |n| \le R + 2^{-1/2}$, so
$\tilde{D}_R \subset D_{R + 2^{-1/2}}$. The reverse inclusion can be proved the same way. It
follows from (80) that

$m(D_{R - 2^{-1/2}}) \le m(\tilde{D}_R) \le m(D_{R + 2^{-1/2}}),$

and hence

$\pi(R - 2^{-1/2})^2 \le N(R) \le \pi(R + 2^{-1/2})^2,$

proving that $N(R) = \pi R^2 + O(R)$.
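
The two-sided bound just obtained is easy to confirm by direct counting. The sketch below is an illustration only; the function names `lattice_count` and `gauss_bounds` are hypothetical, and the count is parametrized by the integer $\mu = R^2$ so that all arithmetic on lattice points stays exact.

```python
import math


def lattice_count(mu):
    """N(R) for R^2 = mu: number of (n1, n2) in Z^2 with n1^2 + n2^2 <= mu."""
    r = math.isqrt(mu)
    # For each n1, the admissible n2 satisfy |n2| <= isqrt(mu - n1^2).
    return sum(2 * math.isqrt(mu - n1 * n1) + 1 for n1 in range(-r, r + 1))


def gauss_bounds(mu):
    """Return (pi (R - 2^-1/2)^2, N(R), pi (R + 2^-1/2)^2) with R = sqrt(mu)."""
    R = math.sqrt(mu)
    half_diag = 2.0 ** -0.5
    lower = math.pi * (R - half_diag) ** 2
    upper = math.pi * (R + half_diag) ** 2
    return lower, lattice_count(mu), upper


if __name__ == "__main__":
    for mu in (1, 100, 2500, 10 ** 6):
        lower, n, upper = gauss_bounds(mu)
        print(mu, lower, n, upper, n - math.pi * mu)
```

The last printed column is the error $N(R) - \pi R^2$, whose true order of magnitude is the subject of the rest of this section.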

There is a similar but somewhat more intricate statement for the averages
of the divisor function. Dirichlet's theorem asserts:

(81)    $\sum_{k=1}^{\mu} d(k) = \mu\log\mu + (2\gamma - 1)\mu + O(\mu^{1/2})$ as $\mu \to \infty$,

where $\gamma$ is Euler's constant.
Again this is a consequence of counting lattice points in the plane:
the left-hand side of (81) is the number of lattice points $(n_1, n_2)$, with
$n_1, n_2 > 0$, that lie on or below the hyperbola $x_1 x_2 = \mu$.${}^{11}$
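
The lattice-point interpretation gives a direct way to compute the left-hand side of (81): for each $n_1$ there are $\lfloor \mu/n_1 \rfloor$ admissible values of $n_2$. The sketch below compares this count with the main term of Dirichlet's formula. It is an illustration only; the function names are hypothetical and the numerical value of Euler's constant is entered as a literal.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, entered as a literal


def divisor_sum(mu):
    """Sum of d(k) for 1 <= k <= mu, as lattice points with n1 * n2 <= mu."""
    return sum(mu // n1 for n1 in range(1, mu + 1))


def divisor_sum_brute(mu):
    """Same sum, computed by counting the divisors of each k directly."""
    return sum(1 for k in range(1, mu + 1)
               for n in range(1, k + 1) if k % n == 0)


def dirichlet_main_term(mu):
    """mu log mu + (2 gamma - 1) mu, the main term in (81)."""
    return mu * math.log(mu) + (2.0 * EULER_GAMMA - 1.0) * mu


if __name__ == "__main__":
    for mu in (100, 1000, 10 ** 4, 10 ** 6):
        err = divisor_sum(mu) - dirichlet_main_term(mu)
        print(mu, divisor_sum(mu), err, err / math.sqrt(mu))
```

The last column shows the error divided by $\mu^{1/2}$, which stays bounded in accordance with (81).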

Both (79) and (81) raise the question of what are the true sizes of 
the error terms appearing in these asymptotic statements. Like other 
important questions of this kind in number theory, these problems have 
a long history involving much effort, but yet remain unsolved. It will be 
our purpose here to show only how the first results that go beyond (79) 
and (81) can be obtained by the help of ideas treated in this chapter. 

8.2 Poisson summation formula 

Indispensable for any further insight into these problems is the application
of the Poisson summation formula. We state this identity here in
the general context of $\mathbb{R}^d$, but with a restricted hypothesis sufficient for
our applications.${}^{12}$

Proposition 8.2 Suppose $f$ belongs to the Schwartz space $\mathcal{S}(\mathbb{R}^d)$. Then

(82)    $\sum_{n \in \mathbb{Z}^d} f(n) = \sum_{n \in \mathbb{Z}^d} \hat{f}(n).$

Here $\mathbb{Z}^d$ denotes the collection of lattice points in $\mathbb{R}^d$, the points with
integral coordinates, and $\hat{f}$ is the Fourier transform of $f$.

For the proof consider two sums

$\sum_{n \in \mathbb{Z}^d} f(x + n)$ and $\sum_{n \in \mathbb{Z}^d} \hat{f}(n)e^{2\pi i n \cdot x}.$

Both are rapidly converging series (since $f$ and $\hat{f}$ are in $\mathcal{S}(\mathbb{R}^d)$), and hence
both these sums are continuous functions. Moreover each is periodic,
that is, each is unchanged when $x$ is replaced by $x + m$, for any $m \in \mathbb{Z}^d$.
For the sum $\sum_{n \in \mathbb{Z}^d} f(x + n)$ this is clear, because replacing $x$ by $x + m$
merely reshuffles the sum. Also the second sum is unchanged, because
of the periodicity of $e^{2\pi i n \cdot x}$ for each $n \in \mathbb{Z}^d$. Moreover both sums have
the same Fourier coefficients. To see this, let $Q$ be the fundamental cube

11 That there might be some connection between the averages of $r_2(k)$ and $d(k)$ is suggested
by the fact that $r_2(k) = 4(d_1(k) - d_3(k))$, for $k \ge 1$, where $d_1$ and $d_3$ are respectively
the number of divisors of $k$ that are $\equiv 1 \bmod 4$ or $\equiv 3 \bmod 4$.

12 Other settings for the formula can be found in Chapter 5 in Book I, and Chapter 4
in Book II.
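$Q = \{x \in \mathbb{R}^d : 0 < x_j \le 1,\ j = 1, \dots, d\}$, and fix any $m \in \mathbb{Z}^d$. Then

$\int_Q \left(\sum_n f(x + n)\right)e^{-2\pi i m \cdot x}\,dx = \sum_n \int_{Q + n} f(x)e^{-2\pi i m \cdot x}\,dx = \int_{\mathbb{R}^d} f(x)e^{-2\pi i m \cdot x}\,dx = \hat{f}(m),$

since $\bigcup_{n \in \mathbb{Z}^d}(Q + n)$ is a partition of $\mathbb{R}^d$ into cubes $\{Q + n\}_{n \in \mathbb{Z}^d}$. Moreover

$\int_Q \left(\sum_n \hat{f}(n)e^{2\pi i n \cdot x}\right)e^{-2\pi i m \cdot x}\,dx = \hat{f}(m),$

because $\int_Q e^{2\pi i n \cdot x}e^{-2\pi i m \cdot x}\,dx = 1$ if $n = m$, and is 0 otherwise. Since
$\sum_n f(x + n)$ and $\sum_n \hat{f}(n)e^{2\pi i n \cdot x}$ have the same Fourier coefficients, these
functions must be equal,${}^{13}$ and setting $x = 0$ gives us (82).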

Next let us see what happens to the summation formula (82) when we
apply it first to the case of a radial function on $\mathbb{R}^2$, $f(x) = f_0(|x|)$, that
is in $\mathcal{S}$, and then we try it with $\chi_R$, the characteristic function of the
disc $D_R$.

Using the formula (4) in Section 1 we obtain

(83)    $\sum_{n \in \mathbb{Z}^2} f_0(|n|) = 2\pi\int_0^\infty f_0(r)\,r\,dr + \sum_{k=1}^{\infty} F_0(k^{1/2})\,r_2(k),$

once we gather together the terms for which $|n|^2 = k$. Here $F_0(\rho) =
2\pi\int_0^\infty J_0(2\pi\rho r)f_0(r)\,r\,dr$, and we note that $J_0(0) = 1$.

If we could apply this formula to the case when / is xh, (the obstacle 
is that of course \R ls not smooth), and use the fact that rJ\{r)= 
J Q r crJo(^) d(7, which is outlined in Exercise 23, this would give us Hardy’s 
identity 

N(R) = ttR 2 + 

k=l 

Note that since $J_1(u)$ is of order $u^{-1/2}$ as $u \to \infty$ (see (11)), the series does not converge absolutely, and this is the barrier in trying to apply (83), even if one is guaranteed the (conditional) convergence of the series. Nevertheless, since each term of the series is $O(R^{-1/2})$, it might be hoped that the error term, $N(R) - \pi R^2$, is roughly of the order $O(R^{1/2})$, and this is what is conjectured.$^{14}$


13 See, for instance, Exercise 16 in Chapter 6 of Book III.

Here we prove the following weaker assertion that is, however, an improvement over (79).

Theorem 8.3 $N(R) = \pi R^2 + O(R^{2/3})$, as $R \to \infty$.

Proof. We replace the characteristic function $\chi_R$ by a regularized version as follows. We fix a non-negative "bump" function $\varphi$ that is $C^\infty$, is supported in the unit disc, and has $\int_{\mathbb{R}^2} \varphi(x)\, dx = 1$. We set $\varphi_\delta(x) = \delta^{-2} \varphi(x/\delta)$, and let

$$\chi_{R,\delta} = \chi_R * \varphi_\delta.$$

Then clearly $\chi_{R,\delta}$ is a $C^\infty$ function of compact support and hence the summation formula (82) applies to it. Notice that $\hat\chi_{R,\delta}(\xi) = \hat\chi_R(\xi)\, \hat\varphi(\delta \xi)$ and $\hat\chi_{R,\delta}(0) = \hat\chi_R(0)\, \hat\varphi(0) = \pi R^2$.

As a result, if we define $N_\delta(R) = \sum_{n \in \mathbb{Z}^2} \chi_{R,\delta}(n)$, then the summation formula yields

$$N_\delta(R) = \pi R^2 + \sum_{n \ne 0} \hat\chi_R(n)\, \hat\varphi(\delta n).$$

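We now estimate the sum above by breaking it into two parts as

$$\sum_{0 < |n| \le 1/\delta} + \sum_{|n| > 1/\delta}.$$

For the first sum we use the fact that

$$|\hat\chi_R(n)| = \frac{R}{|n|}\, |J_1(2\pi |n| R)| = O(R^{1/2} |n|^{-3/2}),$$

by what has been said above, and that $|\hat\varphi(n\delta)| = O(1)$. This gives

$$\sum_{0 < |n| \le 1/\delta} = O\Big( R^{1/2} \sum_{0 < |n| \le 1/\delta} |n|^{-3/2} \Big) = O\Big( R^{1/2} \int_{|x| \le 1/\delta} |x|^{-3/2}\, dx \Big) = O(R^{1/2} \delta^{-1/2}).$$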



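14 More precisely the guess is that the error term is $O(R^{1/2 + \epsilon})$ for every $\epsilon > 0$. See also Problem 6.

Similarly,

$$\sum_{|n| > 1/\delta} = O\Big( R^{1/2} \delta^{-1} \sum_{|n| > 1/\delta} |n|^{-5/2} \Big) = O\Big( R^{1/2} \delta^{-1} \int_{|x| \ge 1/\delta} |x|^{-5/2}\, dx \Big),$$

since $|\hat\varphi(n\delta)| = O(|n|^{-1} \delta^{-1})$. (In fact $\hat\varphi$ is rapidly decreasing.) Thus this sum is also $O(R^{1/2} \delta^{-1/2})$. We conclude therefore that

$$N_\delta(R) = \pi R^2 + O(R^{1/2} \delta^{-1/2}). \tag{84}$$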


However there is a simple relation between $N_\delta(R)$ and $N(R)$, namely

$$N_\delta(R - \delta) \le N(R) \le N_\delta(R + \delta). \tag{85}$$

This in turn follows from the observation that

$$\chi_{R - \delta, \delta} \le \chi_R \le \chi_{R + \delta, \delta}.$$

The inequality on the right-hand side, $\chi_R(x) \le \int \chi_{R + \delta}(x - y)\, \varphi_\delta(y)\, dy$, is clear because $x \in D_R$ and $|y| \le \delta$ implies $x - y \in D_{R + \delta}$. Similarly for the inequality on the left-hand side.

Finally by (84), we have $N_\delta(R + \delta) = \pi R^2 + O(R^{1/2} \delta^{-1/2}) + O(R\delta)$ and analogously $N_\delta(R - \delta) = \pi R^2 + O(R^{1/2} \delta^{-1/2}) + O(R\delta)$. Altogether then, (85) yields that

$$N(R) = \pi R^2 + O(R^{1/2} \delta^{-1/2}) + O(R\delta).$$

By choosing $\delta = R^{-1/3}$ we make both $O$ terms above equal, and this gives

$$N(R) = \pi R^2 + O(R^{2/3}).$$

The theorem is therefore proved.
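To get a feeling for the size of the error term, one can count the lattice points in the disc directly. The sketch below is only an illustration under the assumption that $R$ is a non-negative integer (so that exact integer square roots can be used); the function name `lattice_points_in_disc` is ours and not part of the text.

```python
import math


def lattice_points_in_disc(R: int) -> int:
    """N(R): number of (m, n) in Z^2 with m^2 + n^2 <= R^2, for an integer R >= 0."""
    total = 0
    for x in range(-R, R + 1):
        # For fixed x the admissible y satisfy |y| <= floor(sqrt(R^2 - x^2)).
        total += 2 * math.isqrt(R * R - x * x) + 1
    return total


if __name__ == "__main__":
    for R in (10, 100, 1000):
        error = lattice_points_in_disc(R) - math.pi * R * R
        # Compare the observed error with the scale R^(2/3) from the theorem.
        print(R, error, error / R ** (2 / 3))
```

The last printed column is the error measured against $R^{2/3}$; the theorem asserts that it stays bounded as $R \to \infty$.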

The approach to Theorem 8.3 leads to a wide generalization in which the disc in $\mathbb{R}^2$ is replaced by an appropriate convex set in $\mathbb{R}^d$.

Recall that a set $\Omega$ is convex if whenever $x$ and $x'$ are in $\Omega$ so is the line segment joining them. Suppose in addition that $\Omega$ is a bounded set with $C^2$ boundary (in the sense of Section 4 in Chapter 7). Then whenever $\rho$ is a defining function for $\Omega$, the second fundamental form (19) is positive semi-definite. (In fact, assuming the contrary, we can find a point on the boundary and coordinates $(x_1, \dots, x_d)$ centered at this point, so that $x_d$ is in the direction of the inward normal, and the quadratic form has an eigenvalue $\lambda_1 < 0$ in the direction $x_1$. Hence near the origin in this coordinate system, the intersection of $\Omega$ with the plane determined by the $x_1$ and $x_d$ axes is then given by $\{x_d > \lambda_1 x_1^2 + o(x_1^2)\}$, which is clearly not convex, contradicting the convexity of $\Omega$.)

With this in mind, we say that $\Omega$ is strongly convex when the quadratic form (19) is strictly positive definite at each point of the boundary of $\Omega$. We denote by $R\Omega$ the dilated set $\{Rx : x \in \Omega\}$ and write $N_R = \#\{\text{lattice points in } R\Omega\}$.

Theorem 8.4 Suppose $\Omega$ is a bounded domain in $\mathbb{R}^d$ with sufficiently smooth boundary.$^{15}$ Assume that $\Omega$ is strongly convex and $0 \in \Omega$. Then

$$N_R = R^d\, m(\Omega) + O\big(R^{d - \frac{2d}{d+1}}\big) \quad\text{as } R \to \infty.$$

The proof follows closely the argument for Theorem 8.3. 

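Proof. Let $\chi$ denote the characteristic function of $\Omega$ and $\chi_R$ that of $R\Omega$, so $\chi_R(x) = \chi(x/R)$. With $\varphi$ a non-negative $C^\infty$ function supported in the unit ball that satisfies $\int \varphi(x)\, dx = 1$, we set $\varphi_\delta(x) = \delta^{-d} \varphi(x/\delta)$. We let $\chi_{R,\delta} = \chi_R * \varphi_\delta$, and set

$$N_{R,\delta} = \sum_{n \in \mathbb{Z}^d} \chi_{R,\delta}(n).$$

Now, by the summation formula (82)

$$N_{R,\delta} = R^d\, m(\Omega) + \sum_{n \ne 0} \hat\chi_R(n)\, \hat\varphi(\delta n),$$

since $\hat\chi_{R,\delta}(\xi) = \hat\chi_R(\xi)\, \hat\varphi(\delta \xi)$, $\hat\chi_R(0) = R^d \hat\chi(0) = R^d\, m(\Omega)$, and $\hat\varphi(0) = 1$. However $\hat\chi(\xi) = O(|\xi|^{-\frac{d+1}{2}})$ by Corollary 3.3. Thus

$$\hat\chi_R(n) = R^d \hat\chi(Rn) = O\big(R^{\frac{d-1}{2}} |n|^{-\frac{d+1}{2}}\big),$$

so

$$\hat\chi_{R,\delta}(n) = O\big(R^{\frac{d-1}{2}} |n|^{-\frac{d+1}{2}}\big)\, \hat\varphi(\delta n).$$

Now we break the sum $\sum_{n \ne 0} \hat\chi_{R,\delta}(n)$ as $\sum_{1 \le |n| \le 1/\delta} + \sum_{1/\delta < |n|}$. For the first term we use the fact that $\hat\chi_{R,\delta}(n) = O(R^{\frac{d-1}{2}} |n|^{-\frac{d+1}{2}})$, and we estimate that sum by $O(R^{\frac{d-1}{2}} \delta^{-\frac{d-1}{2}})$ (for example by comparing it with $R^{\frac{d-1}{2}} \int_{|x| \le 1/\delta} |x|^{-\frac{d+1}{2}}\, dx$).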

15 The proof will show that $C^{d+2}$ suffices. See the remark at the end of Section

The second term leads to $R^{\frac{d-1}{2}} \sum_{|n| > 1/\delta} |n|^{-\frac{d+1}{2}} (|n|\delta)^{-r}$ for any $r > 0$, since $\hat\varphi$ is rapidly decreasing. Choosing $r$ sufficiently large, say $r = d/2$, gives the estimate $O(R^{\frac{d-1}{2}} \delta^{-\frac{d-1}{2}})$ for this part of the sum. Hence

$$N_{R,\delta} = R^d\, m(\Omega) + O\big(R^{\frac{d-1}{2}} \delta^{-\frac{d-1}{2}}\big). \tag{86}$$

Next we observe that for an appropriate $c > 0$

$$N_{R - c\delta, \delta} \le N_R \le N_{R + c\delta, \delta}. \tag{87}$$

This inequality follows from

$$\chi_{R - c\delta, \delta} \le \chi_R \le \chi_{R + c\delta, \delta}.$$

The inequality on the right-hand side,

$$\chi_R(x) \le \int \chi_{R + c\delta}(x - y)\, \varphi_\delta(y)\, dy,$$

is a consequence of the geometric observation that there is a $c > 0$, so that whenever $R \ge 1$, and $\delta \le 1$,

$$x \text{ in } R\Omega \text{ and } |y| \le \delta \text{ imply that } x - y \in (R + c\delta)\Omega. \tag{88}$$

The proof of this geometrical fact about the convexity of $\Omega$ is outlined in Exercise 21.

The inequality $\chi_{R - c\delta, \delta} \le \chi_R$ is seen in the same way.

Now a combination of (86) and (87) shows that

$$N_R = R^d\, m(\Omega) + O\big(R^{\frac{d-1}{2}} \delta^{-\frac{d-1}{2}}\big) + O(R^{d-1} \delta).$$

If we now choose $\delta = R^{-\frac{d-1}{d+1}}$, then both $O$ terms are $O\big(R^{d - \frac{2d}{d+1}}\big)$ and the theorem is proved.
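The choice of $\delta$ that balances the two error terms $R^{(d-1)/2} \delta^{-(d-1)/2}$ and $R^{d-1}\delta$ can be checked with exact rational arithmetic. The following is a small sketch; the helper name `balanced_exponents` and its return convention are assumptions made for illustration only.

```python
from fractions import Fraction


def balanced_exponents(d: int) -> tuple[Fraction, Fraction]:
    """Return (s, e): delta = R^(-s) equalizes both error terms, each of size R^e."""
    half = Fraction(d - 1, 2)
    # With delta = R^(-s) the two terms have exponents half + half*s and (d - 1) - s.
    # Equating them gives s = (d - 1)/(d + 1).
    s = Fraction(d - 1, d + 1)
    e = (d - 1) - s
    assert half + half * s == e  # both error terms really carry the same power of R
    return s, e


if __name__ == "__main__":
    for d in range(2, 6):
        s, e = balanced_exponents(d)
        print(f"d={d}: delta = R^(-{s}), error = O(R^({e}))")
```

For $d = 2$ this recovers $\delta = R^{-1/3}$ and the exponent $2/3$ of Theorem 8.3, and in general the common exponent equals $d - 2d/(d+1)$.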


8.3 Hyperbolic measure 

We turn to the improvement of (81) for the divisor problem that is analogous to Theorem 8.3.

Theorem 8.5

$$\sum_{k=1}^{\mu} d(k) = \mu \log \mu + (2\gamma - 1)\mu + O(\mu^{1/3} \log \mu) \quad\text{as } \mu \to \infty.^{16} \tag{89}$$

Now as much as we might wish to follow the lines of the proof of Theorem 8.3, there are serious obstacles that seem to stand in the way. In fact, if $\chi_\mu$ is the characteristic function of the region

$$\{(x_1, x_2) \in \mathbb{R}^2 : x_1 x_2 \le \mu,\ x_1 > 0, \text{ and } x_2 > 0\}, \tag{90}$$

which consists of the area on or below the hyperbola $x_1 x_2 = \mu$, then indeed

$$\sum_{k=1}^{\mu} d(k) = \sum_{n \in \mathbb{Z}^2} \chi_\mu(n).$$

However the other side of the Poisson summation formula (82) for $f = \chi_\mu$ is problematic as it stands. In fact, $\hat\chi_\mu(0) = \int_{\mathbb{R}^2} \chi_\mu\, dx = \infty$, and for the same reason the integral giving each term $\hat\chi_\mu(n)$ is not well-defined.

A further issue is that the main term in (89) is $\mu \log \mu$, while a simple scaling of the region (90) would suggest rather a term linear in $\mu$. Connected with this is the mysterious occurrence of Euler's constant $\gamma$ in the subsidiary term.
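The lattice-point interpretation of the divisor sum is easy to test numerically: counting pairs $(m, n)$ of positive integers with $mn \le \mu$ row by row gives $\sum_{m \le \mu} \lfloor \mu/m \rfloor$. The sketch below, with the illustrative names `divisor_sum` and `main_term` and a hard-coded double-precision value of Euler's constant (all assumptions of this example), compares that count with the main terms in (89).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant to double precision


def divisor_sum(mu: int) -> int:
    """Sum of d(k) over 1 <= k <= mu, counted as lattice points (m, n) with m, n >= 1, m*n <= mu."""
    return sum(mu // m for m in range(1, mu + 1))


def main_term(mu: int) -> float:
    """The main terms mu*log(mu) + (2*gamma - 1)*mu of the divisor sum."""
    return mu * math.log(mu) + (2 * EULER_GAMMA - 1) * mu


if __name__ == "__main__":
    for mu in (10**2, 10**4, 10**6):
        diff = divisor_sum(mu) - main_term(mu)
        # Compare the remainder with the scale mu^(1/3) * log(mu) from the theorem.
        print(mu, diff, diff / (mu ** (1 / 3) * math.log(mu)))
```

The last column measures the remainder against $\mu^{1/3} \log \mu$.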

Now the essence of our analysis of lattice points in $D_R$ (and formulas like (83)) lies in the facts about Fourier transforms of radial functions in two dimensions, which in turn depend on the Fourier transform of the invariant measure of the circle. In parallel to this we seek the analog where instead of radial functions we consider functions invariant under "hyperbolic dilations" $(x_1, x_2) \mapsto (\delta x_1, \delta^{-1} x_2)$, $\delta > 0$, and a corresponding invariant measure in $\mathbb{R}^2$ supported on the hyperbola $x_1 x_2 = 1$.

We begin with the hyperbolic measure on $\mathbb{R}^2$, denoted by $d\mathfrak{h}$, and which is defined by the integration formula

$$\int_{\mathbb{R}^2} f\, d\mathfrak{h} = \int_0^\infty f(u, 1/u)\, \frac{du}{u},$$

valid for every continuous function $f$ of compact support. Alternatively

$$\mathfrak{h}(E) = \int_0^\infty \chi_E(u, 1/u)\, \frac{du}{u}$$

for every Borel set $E$ in $\mathbb{R}^2$, with the integral taken in the extended sense. Note that the measure $\mathfrak{h}$ is invariant under the scalings $(x_1, x_2) \mapsto (\delta x_1, \delta^{-1} x_2)$, for $\delta > 0$.

Now the linear functional $f \mapsto \int_0^\infty f(u, 1/u)\, \frac{du}{u}$ is well-defined for $f \in \mathcal{S}$ in view of the rapid convergence of the integral, and moreover this convergence shows that the measure $\mathfrak{h}$ can be considered by this formula to be a tempered distribution. We seek to determine the Fourier transform of this distribution, and matters will depend on a pair of oscillatory integrals $\mathfrak{J}^+$ and $\mathfrak{J}^-$. These are given formally by

$$\mathfrak{J}^{\pm}(\lambda) = \int_0^\infty e^{i\lambda (u \pm 1/u)}\, \frac{du}{u}.$$

Since these integrals do not converge absolutely (either at 0 or at infinity) they must be considered as appropriate limits after truncation.

For this purpose we pick $\eta$ to be a non-negative $C^\infty$ function on $[0, \infty)$ with $\eta(u) = 0$ for small $u$ and $\eta(u) = 1$ if $u \ge 1$, and set $\eta_a(u) = \eta(u/a)$. We then define the convergent integral

$$\mathfrak{J}^+_{a,b}(\lambda) = \int_0^\infty e^{i\lambda (u + 1/u)}\, \eta_a(u)\, \eta_b(1/u)\, \frac{du}{u},$$

with a similar definition for $\mathfrak{J}^-_{a,b}$. To begin with we take $0 < a, b \le 1/2$.

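Proposition 8.6 For each $\lambda \ne 0$ the limit $\mathfrak{J}^+(\lambda) = \lim_{a, b \to 0} \mathfrak{J}^+_{a,b}(\lambda)$ exists. Moreover, uniformly in $a$ and $b$, we have:

(i) $\mathfrak{J}^+_{a,b}(\lambda) = \big( \sum_{k=0}^N c_k \lambda^{-1/2 - k} \big) e^{2i\lambda} + O(|\lambda|^{-3/2 - N})$, for $|\lambda| \ge 1/2$ and for every $N \ge 0$, with $c_0, c_1, \dots, c_k, \dots$ appropriate constants.

(ii) $\mathfrak{J}^+_{a,b}(\lambda) = O(\log 1/|\lambda|)$, for $|\lambda| \le 1/2$.

Proof. We divide the integral $\mathfrak{J}^+_{a,b}$ into three parts as follows. Let $\alpha$ be a $C^\infty$ function so that $\alpha(u) = 1$ when $3/4 \le u \le 4/3$, and $\alpha$ is supported in $[1/2, 2]$. Set $\beta = 1 - \alpha$ so $\beta$ is supported where $u \le 3/4$ or $u \ge 4/3$. Then split $\mathfrak{J}^+_{a,b}$ as $I + II + III$, where

$$II = \int_{1/2}^{2} e^{i\lambda \Phi(u)}\, \alpha(u)\, \frac{du}{u}, \qquad I = \int_0^{3/4} e^{i\lambda \Phi(u)}\, \beta(u)\, \eta_a(u)\, \frac{du}{u},$$

and

$$III = \int_{4/3}^{\infty} e^{i\lambda \Phi(u)}\, \beta(u)\, \eta_b(1/u)\, \frac{du}{u}.$$

Here we have written $\Phi(u)$ for $u + 1/u$.

Now we observe that $\Phi'(1) = 0$, while $\Phi''(u) > 0$ for all $u$, so that $u = 1$ is the (only) critical point of $\Phi$. Also, since $\Phi(1) = 2$, we are led to make the change of variables $\Phi(u) = u + 1/u = 2 + x^2$. Solving the quadratic equations involved gives

$$x = \frac{u - 1}{u^{1/2}}, \qquad u = 1 + \frac{x^2}{2} + x \Big( 1 + \frac{x^2}{4} \Big)^{1/2},$$

which shows that $u \mapsto x$ is a smooth bijection of the interval $[1/2, 2]$ with $[-2^{-1/2}, 2^{-1/2}]$.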
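The change of variables between $u$ and $x$ can be verified numerically. The sketch below checks that the two explicit formulas are inverse to each other and that $u + 1/u = 2 + x^2$ along the way; the function names `u_of_x` and `x_of_u` are illustrative and not taken from the text.

```python
import math


def u_of_x(x: float) -> float:
    """Positive root u of u + 1/u = 2 + x^2 chosen so that x = (u - 1)/sqrt(u)."""
    return 1 + x * x / 2 + x * math.sqrt(1 + x * x / 4)


def x_of_u(u: float) -> float:
    """The new variable x = (u - 1)/sqrt(u), defined for u > 0."""
    return (u - 1) / math.sqrt(u)


if __name__ == "__main__":
    for u in (0.5, 0.75, 1.0, 4 / 3, 2.0):
        x = x_of_u(u)
        # The phase u + 1/u and its normal form 2 + x^2 should coincide.
        print(u, x, u + 1 / u, 2 + x * x, u_of_x(x))
```

Since $x$ is increasing in $u$, the critical point $u = 1$ corresponds to $x = 0$ and the phase takes the normal form $2 + x^2$ needed for the asymptotic formula.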

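Making the indicated change of variables, we see that the integral $II$ becomes

$$e^{2i\lambda} \int e^{i\lambda x^2}\, \tilde\alpha(x)\, dx,$$

with $\tilde\alpha$ a $C^\infty$ function of compact support. We now invoke the asymptotic formula (8) to obtain

$$II = \Big( \sum_{k=0}^N c_k \lambda^{-1/2 - k} \Big) e^{2i\lambda} + O(|\lambda|^{-3/2 - N}),$$

for every $N \ge 0$.

Next, to deal with the integral $I$, we write

$$L = \frac{1}{i\lambda \Phi'(u)}\, \frac{d}{du}.$$

Then $L(e^{i\lambda \Phi}) = e^{i\lambda \Phi}$, and for every integer $N \ge 1$

$$I = \int_0^{3/4} L^N(e^{i\lambda \Phi})\, \beta(u)\, \eta_a(u)\, \frac{du}{u}. \tag{91}$$

Let us first consider the case $N = 1$. Since $\Phi'(u) = 1 - 1/u^2$, and $1/\Phi'(u) = u^2/(u^2 - 1)$, integration by parts shows

$$I = -\frac{1}{i\lambda} \int_0^{3/4} e^{i\lambda \Phi(u)}\, \frac{d}{du} \big( u \beta_1(u) \eta_a(u) \big)\, du,$$

where $\beta_1(u) = \beta(u)/(u^2 - 1)$, and $\beta_1$ is smooth.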

Carrying out the differentiation leads to two terms. First, if the derivative falls on $u\beta_1(u)$, the resulting contribution to $I$ is certainly $O(1/|\lambda|)$. For the second, if the derivative falls on $\eta_a(u)$ the contribution is also $O(1/|\lambda|)$, since $(\eta_a(u))' = O(1/a)$, and $\eta_a'(u)$ is supported on $[0, a]$. This shows that $I = O(1/|\lambda|)$.

For $N > 1$ we use (91) again, and carry out the integration by parts $N$ times. Now at each step we get a gain of a factor of $u$ and a possible loss of a factor of $a^{-1}$, the latter occurring when $\eta_a$ is differentiated. So altogether this shows that $I = O(|\lambda|^{-N})$ for each positive integer $N$. The integral $III$ is similar to that of $I$, as can be seen by transforming it by the mapping $u \mapsto 1/u$. So we also have $III = O(|\lambda|^{-N})$, and hence conclusion (i) of the proposition is proved.

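Next, when $|\lambda| \le 1/2$, since $II$ is obviously bounded, we need only estimate $I$ and $III$. Turning to $I$ we write as before

$$I = -\frac{1}{i\lambda} \int_0^{3/4} e^{i\lambda \Phi}\, \frac{d}{du} \big( u \beta_1(u) \eta_a(u) \big)\, du = -\frac{1}{i\lambda} \Big( \int_0^{|\lambda|} + \int_{|\lambda|}^{3/4} \Big).$$

But the first term is majorized by a multiple of

$$\frac{1}{|\lambda|} \int_0^{|\lambda|} \big( 1 + u |\eta_a'(u)| \big)\, du = O(1),$$

while the second term can be written

$$\int_{|\lambda|}^{3/4} e^{i\lambda \Phi}\, \beta(u)\, \eta_a(u)\, \frac{du}{u} + O(1),$$

which is clearly $O\big( \int_{|\lambda|}^{3/4} \frac{du}{u} \big) + O(1) = O(\log 1/|\lambda|)$. The estimate for $III$ is similar, so conclusion (ii) is established.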

To prove the convergence of $\mathfrak{J}^+_{a,b}(\lambda)$ as $a, b \to 0$, note that the integral $II$ is independent of $a$ and $b$. Now consider $I$ and recall that it depends only on $a$. We have

$$I_a - I_{a'} = \int e^{i\lambda \Phi(u)} \big( \eta_a(u) - \eta_{a'}(u) \big) \beta(u)\, \frac{du}{u},$$

and the integrand is supported only on $(0, \max(a, a'))$. Now as before

$$I_a - I_{a'} = \frac{1}{i\lambda} \int \frac{d}{du} \big( e^{i\lambda \Phi(u)} \big) \big( \eta_a(u) - \eta_{a'}(u) \big)\, u \beta_1(u)\, du,$$

and an integration by parts shows that this difference is $O\big( \frac{1}{|\lambda|} \max(a, a') \big)$. Since $\lambda$ is fixed, $\lambda \ne 0$, this tends to zero with $a$ and $a'$, and so $I_a$ tends to a limit as $a \to 0$. The term $III$ is treated similarly, and hence $\mathfrak{J}^+_{a,b}(\lambda)$ tends to a limit, proving the proposition.


A similar result holds for $\mathfrak{J}^-$, except for one change.

Corollary 8.7 The conclusions for $\mathfrak{J}^-_{a,b}$ are the same as those for $\mathfrak{J}^+_{a,b}$ stated in Proposition 8.6, except that (i) should be modified to read that uniformly in $a$, $b$,

(i') $\mathfrak{J}^-_{a,b}(\lambda) = O(|\lambda|^{-N})$ for $|\lambda| \ge 1/2$, for every $N \ge 0$.

The only change occurs in the treatment of $II$, namely $\int e^{i\lambda \Phi(u)}\, \alpha(u)\, \frac{du}{u}$, where now $\Phi(u) = u - 1/u$. In this case $\Phi'(u) = 1 + 1/u^2 \ge 1$, and there is no critical point. So Proposition 2.1 implies that $II = O(|\lambda|^{-N})$ for every $N \ge 0$, and then conclusion (i') follows by the arguments we have used for $I$ and $III$ previously.

Remarks. Two further observations about $\mathfrak{J}^{\pm}_{a,b}$ are straightforward consequences of the arguments given above.

1. $\mathfrak{J}^+(\lambda)$ and $\mathfrak{J}^-(\lambda)$ are both continuous in $\lambda$ if $\lambda \ne 0$.

2. The uniformity of the estimates (i), (i') and (ii) holds in the wider range $0 < a < \infty$, $0 < b < \infty$, with the only change being that in the asymptotic formula in (i) the constants $c_k$ may depend on $a$ and $b$, but are still uniformly bounded. For example, when $a \le 1/2$ but now $b$ is unrestricted, then in the term $II$, $\alpha(u)$ is replaced by $\alpha(u)\eta(1/(bu))$, which is still uniformly smooth when $b \ge 1/2$. In $I$, the function $\beta_1(u)$ is replaced by $\beta_1(u)\eta(1/(bu))$ with the same effect. This reasoning clearly applies when both $a$ and $b$ are large.


8.4 Fourier transforms 

We now come to the Fourier transform of $\mathfrak{h}$. It is convenient at this point to change our notation slightly, so that a general point $(x_1, x_2)$ of $\mathbb{R}^2$ will now be denoted instead by $(x, y)$, and similarly the dual variable in $\mathbb{R}^2$ will be denoted by $(\xi, \eta)$.$^{17}$

We divide the plane $\mathbb{R}^2$ into its four proper quadrants $Q_1$, $Q_2$, $Q_3$ and $Q_4$ (together with the $x$ and $y$ axes) with $Q_1 = \{(x, y) : x > 0 \text{ and } y > 0\}$, $Q_2 = \{(x, y) : x < 0 \text{ and } y > 0\}$ and so on.


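17 This will reduce the burden of subscripts in some of our formulas.

Proposition 8.8 The Fourier transform $\hat{\mathfrak{h}}$ (taken as a tempered distribution) is a continuous function when $\xi\eta \ne 0$ and is given by

$$\hat{\mathfrak{h}}(\xi, \eta) = \begin{cases} \mathfrak{J}^+(-2\pi |\xi\eta|^{1/2}) & \text{in } Q_1, \\ \mathfrak{J}^-(-2\pi |\xi\eta|^{1/2}) & \text{in } Q_2, \\ \mathfrak{J}^+(2\pi |\xi\eta|^{1/2}) & \text{in } Q_3, \\ \mathfrak{J}^-(2\pi |\xi\eta|^{1/2}) & \text{in } Q_4. \end{cases}$$

Proof. We approximate $\mathfrak{h}$ by the finite measures $\mathfrak{h}_\epsilon$ given by

$$\int_{\mathbb{R}^2} f\, d\mathfrak{h}_\epsilon = \int_0^\infty f(u, 1/u)\, \eta_\epsilon(u)\, \eta_\epsilon(1/u)\, \frac{du}{u}.$$

Now clearly $\int f\, d\mathfrak{h}_\epsilon \to \int f\, d\mathfrak{h}$ as $\epsilon \to 0$, whenever $f \in \mathcal{S}$, so the measures $\mathfrak{h}_\epsilon$ converge to $\mathfrak{h}$ in the sense of tempered distributions. Now

$$\hat{\mathfrak{h}}_\epsilon(\xi, \eta) = \int_0^\infty e^{-2\pi i (\xi u + \eta/u)}\, \eta_\epsilon(u)\, \eta_\epsilon(1/u)\, \frac{du}{u}.$$

Suppose first $(\xi, \eta)$ is in $Q_1$ and therefore $\xi > 0$ and $\eta > 0$. Keeping $(\xi, \eta)$ fixed, we make the change of variables $u \mapsto (\eta/\xi)^{1/2} u$. Then $\xi u + \eta/u$ becomes $(\xi\eta)^{1/2}(u + 1/u)$, while $\eta_\epsilon(u) = \eta(u/\epsilon)$ is transformed to $\eta_a(u)$, with $a = \epsilon (\xi/\eta)^{1/2}$, while $\eta_\epsilon(1/u)$ becomes $\eta_b(1/u)$, with $b = \epsilon (\eta/\xi)^{1/2}$. Also, the measure $\frac{du}{u}$ is unchanged. So

$$\hat{\mathfrak{h}}_\epsilon(\xi, \eta) = \mathfrak{J}^+_{a,b}(-2\pi |\xi\eta|^{1/2})$$

in the first quadrant, with analogous formulas in the other three quadrants.

Now the conclusions (i), (ii), and (i') of Proposition 8.6 and its corollary show that

$$|\hat{\mathfrak{h}}_\epsilon(\xi, \eta)| \le A |\xi\eta|^{-1/4} \ \text{ for } |\xi\eta| \ge 1/2, \qquad |\hat{\mathfrak{h}}_\epsilon(\xi, \eta)| \le A \log(1/|\xi\eta|) \ \text{ for } |\xi\eta| \le 1/2,$$

uniformly in $\epsilon$. Moreover, for each $(\xi, \eta)$ with $\xi\eta \ne 0$, $\hat{\mathfrak{h}}_\epsilon(\xi, \eta)$ converges to a limit as $\epsilon \to 0$. This suffices to show that $\hat{\mathfrak{h}}_\epsilon$ converges in the sense of tempered distributions to the function $\hat{\mathfrak{h}}$ given by $\lim_{\epsilon \to 0} \hat{\mathfrak{h}}_\epsilon(\xi, \eta)$. This is because the above estimates imply that

$$\int_{\mathbb{R}^2} \hat{\mathfrak{h}}_\epsilon\, g \to \int_{\mathbb{R}^2} \hat{\mathfrak{h}}\, g, \quad\text{for any } g \in \mathcal{S},$$

by the dominated convergence theorem. Thus the proposition is proved.

We next study the Fourier transform of functions in $\mathbb{R}^2$ that are invariant under the dilations $(x, y) \mapsto (\delta x, \delta^{-1} y)$, with $\delta > 0$. We state the result for a restricted class of smooth functions of the type that is needed below, although the main identities hold for broader classes of functions. We will suppose that $f$ is of the form $f(x, y) = f_0(xy)$ in the first quadrant, and vanishes in the other three quadrants. The function $f_0$ will be assumed to be a $C^\infty$ function with compact support on $(0, \infty)$. Functions $f$ of this form are never integrable on the whole of $\mathbb{R}^2$ (unless $f_0 = 0$) but since they are bounded, they are of course tempered distributions.

Theorem 8.9 Let $\hat f$ be the Fourier transform of $f(x, y) = f_0(xy)$. Then $\hat f$ is a continuous function where $\xi\eta \ne 0$. It is given by

$$\hat f(\xi, \eta) = 2 \int_0^\infty \mathfrak{J}^+(-2\pi |\xi\eta|^{1/2} \rho)\, f_0(\rho^2)\, \rho\, d\rho \tag{92}$$

for $(\xi, \eta) \in Q_1$. In $Q_2$, $Q_3$, and $Q_4$ it is given by the analogous formulas, with $\mathfrak{J}^+(-\,\cdot)$ replaced by $\mathfrak{J}^-(-\,\cdot)$, $\mathfrak{J}^+(+\,\cdot)$ and $\mathfrak{J}^-(+\,\cdot)$, respectively.

Proof. We approximate $f$ by $f_\epsilon$, with $f_\epsilon(x, y) = f_0(xy)\, \eta_\epsilon(x)\, \eta_\epsilon(y)$. Then each $f_\epsilon$ is a $C^\infty$ function of compact support, and clearly $f_\epsilon \to f$ in the sense of tempered distributions.

Now

$$\hat f_\epsilon(\xi, \eta) = \int_{\mathbb{R}^2} e^{-2\pi i (\xi x + \eta y)} f_0(xy)\, \eta_\epsilon(x)\, \eta_\epsilon(y)\, dx\, dy.$$

We introduce new variables $(u, \rho)$ in the first quadrant with $x = u\rho$, $y = \rho/u$, and observe that

$$\frac{\partial(x, y)}{\partial(u, \rho)} = \begin{pmatrix} \rho & u \\ -\rho/u^2 & 1/u \end{pmatrix},$$

which has a determinant equal to $2\rho/u$. Therefore $dx\, dy = 2\rho\, \frac{du}{u}\, d\rho$ and

$$\hat f_\epsilon(\xi, \eta) = 2 \int_0^\infty \Big( \int_0^\infty e^{-2\pi i (\xi u \rho + \eta \rho / u)}\, \eta_\epsilon(u\rho)\, \eta_\epsilon(\rho/u)\, \frac{du}{u} \Big) f_0(\rho^2)\, \rho\, d\rho.$$

Again, if $(\xi, \eta)$ is in the first quadrant and if we make the change of variables $u \mapsto (\eta/\xi)^{1/2} u$, then we have

$$\hat f_\epsilon(\xi, \eta) = 2 \int_0^\infty \mathfrak{J}^+_{a,b}(-2\pi |\xi\eta|^{1/2} \rho)\, f_0(\rho^2)\, \rho\, d\rho, \tag{93}$$

with now $a = \frac{\epsilon}{\rho} \big( \frac{\xi}{\eta} \big)^{1/2}$ and $b = \frac{\epsilon}{\rho} \big( \frac{\eta}{\xi} \big)^{1/2}$.

The analogous formulas for $\hat f_\epsilon(\xi, \eta)$ hold when $(\xi, \eta)$ are in the second, third and fourth quadrants. So the fact that $\hat f_\epsilon$ converges in the sense of tempered distributions to the limit $\hat f$ given by (92) then follows by the same reasoning used in the proof of Proposition 8.8.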
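The Jacobian of the hyperbolic change of variables $x = u\rho$, $y = \rho/u$ used in the proof of Theorem 8.9 can be confirmed by finite differences. This is only a numerical sketch; the function name `jacobian_det` and the step size `h` are assumptions of the example.

```python
def jacobian_det(u: float, rho: float, h: float = 1e-6) -> float:
    """Central-difference approximation of det d(x, y)/d(u, rho) for x = u*rho, y = rho/u."""

    def xy(u_: float, rho_: float) -> tuple[float, float]:
        return u_ * rho_, rho_ / u_

    # Partial derivatives with respect to u.
    x_plus, y_plus = xy(u + h, rho)
    x_minus, y_minus = xy(u - h, rho)
    x_u = (x_plus - x_minus) / (2 * h)
    y_u = (y_plus - y_minus) / (2 * h)

    # Partial derivatives with respect to rho.
    x_plus, y_plus = xy(u, rho + h)
    x_minus, y_minus = xy(u, rho - h)
    x_rho = (x_plus - x_minus) / (2 * h)
    y_rho = (y_plus - y_minus) / (2 * h)

    return x_u * y_rho - x_rho * y_u


if __name__ == "__main__":
    for u, rho in ((0.5, 1.0), (2.0, 3.0), (1.5, 0.25)):
        print(u, rho, jacobian_det(u, rho), 2 * rho / u)
```

The numerical determinant should match $2\rho/u$, which is what produces the factor $2\rho\, \frac{du}{u}\, d\rho$ and hence the invariant measure $\frac{du}{u}$ in the inner integral.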

Corollary 8.10 The Fourier transforms $\hat f_\epsilon$ and $\hat f$ satisfy the following estimate, uniformly in $\epsilon$:

$$|\hat f_\epsilon(\xi, \eta)| \le A_N |\xi\eta|^{-N} \quad\text{when } |\xi\eta| \ge 1/2, \tag{94}$$

for every $N \ge 0$.

This is a consequence of the asymptotic behavior of $\mathfrak{J}^{\pm}(\lambda)$ for large $\lambda$ as given in Proposition 8.6 and its corollary together with the fact that $\int_0^\infty e^{-4\pi i \rho |\xi\eta|^{1/2}} f_0(\rho^2)\, \rho\, d\rho$ is $O(|\xi\eta|^{-N})$ for every $N \ge 0$, since $f_0(\rho^2)\rho$ is a $C^\infty$ function with compact support in $(0, \infty)$.

8.5 A summation formula 

Here we obtain the hyperbolic analog of the summation formula (83). It will be convenient now to put together the oscillatory integrals for the four quadrants and write $\mathfrak{J}$ for

$$\mathfrak{J}(\lambda) = 2 \big( \mathfrak{J}^+(\lambda) + \mathfrak{J}^+(-\lambda) + \mathfrak{J}^-(\lambda) + \mathfrak{J}^-(-\lambda) \big).^{18}$$

Again $f_0$ is a $C^\infty$ function with compact support in $(0, \infty)$.

Theorem 8.11

$$\sum_{k=1}^\infty f_0(k)\, d(k) = \int_0^\infty (\log \rho + 2\gamma)\, f_0(\rho)\, d\rho + \sum_{k=1}^\infty F_0(k)\, d(k), \tag{95}$$

where

$$F_0(u) = \int_0^\infty \mathfrak{J}(2\pi u^{1/2} \rho)\, f_0(\rho^2)\, \rho\, d\rho.$$

Proof. We apply the Poisson summation formula

$$\sum_{(m,n) \in \mathbb{Z}^2} f_\epsilon(m, n) = \sum_{(m,n) \in \mathbb{Z}^2} \hat f_\epsilon(m, n),$$

to the approximating functions $f_\epsilon$, and then pass to the limit as $\epsilon \to 0$. Now the sum on the left-hand side is clearly taken over a bounded set of lattice points since $f_0(u)$ has compact support in $(0, \infty)$. Thus, gathering together the $(m, n)$ for which $mn = k$ gives the left-hand side of the formula.


18 The expression of $\mathfrak{J}$ in terms of Bessel-like functions is given in Problem 7.

Now divide the sum on the right-hand side in two parts. One part taken over those $(m, n)$ for which $mn \ne 0$, and the other part taken over those $(m, n)$ where either $m = 0$, or $n = 0$, or both, $m = n = 0$.

By Theorem 8.9 and Corollary 8.10, we see first that

$$\lim_{\epsilon \to 0} \sum_{mn \ne 0} \hat f_\epsilon(m, n) = \sum_{mn \ne 0} \hat f(m, n),$$

since the series are dominated by the convergent series $\sum_{mn \ne 0} |mn|^{-2}$. Next, gathering together those $(m, n)$ for which $|mn| = k$, gives us

$$\sum_{mn \ne 0} \hat f(m, n) = \sum_{k=1}^\infty F_0(k)\, d(k)$$

because of formula (92).

It remains to evaluate the limit as $\epsilon \to 0$ of

$$\sum_{mn = 0} \hat f_\epsilon(m, n). \tag{96}$$

Now, one part of (96) is $\sum_m \hat f_\epsilon(m, 0)$, which, by the Poisson summation formula (this time in its one-dimensional form) equals

$$\sum_m \int_{\mathbb{R}} f_\epsilon(m, y)\, dy.$$

However $f_\epsilon(x, y) = f_0(xy)\, \eta_\epsilon(x)\, \eta_\epsilon(y)$ and $f_\epsilon$ is supported in the first quadrant, so this sum is

$$\sum_{m=1}^\infty \int_0^\infty f_0(my)\, \eta_\epsilon(m)\, \eta_\epsilon(y)\, dy.$$

Upon making the change of variables $my \mapsto y$ in the integral and interchanging the summation and integration (which is easily justified), we see that the sum becomes

$$\int_0^\infty k_\epsilon(y)\, f_0(y)\, dy,$$

with $k_\epsilon(y) = \sum_{m \ge 1} \eta_\epsilon(y/m)\, \frac{1}{m}$, when we take $0 < \epsilon \le 1$. (Note that then $\eta_\epsilon(m) = 1$ if $m \ge 1$.)

We claim that if $c_0 = \int_0^1 \eta(x)\, \frac{dx}{x}$, then

$$k_\epsilon(y) = \log(y/\epsilon) + \gamma + c_0 + O(\epsilon/y) \quad\text{as } \epsilon \to 0, \tag{97}$$

and this estimate is uniform as long as $y$ ranges over a compact subset of $(0, \infty)$.

To see this we divide the sum $k_\epsilon(y)$ in two parts: where the summation is taken over $m$ with $m \le y/\epsilon$, and the complementary part. Since $\eta_\epsilon(y/m) = \eta(y/(\epsilon m)) = 1$ when $m \le y/\epsilon$, that part of the sum is $\sum_{1 \le m \le y/\epsilon} 1/m$ which equals $\log(y/\epsilon) + \gamma + O(\epsilon/y)$ by the defining property of Euler's $\gamma$.$^{19}$
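The defining property of Euler's constant used here, $\sum_{1 \le m \le M} 1/m = \log M + \gamma + O(1/M)$, is easy to observe numerically. The sketch below assumes an integer cutoff $M$ (in the text the cutoff $y/\epsilon$ need not be an integer) and a hard-coded double-precision value of $\gamma$; the name `harmonic_remainder` is illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant to double precision


def harmonic_remainder(M: int) -> float:
    """Return H_M - log(M) - gamma, where H_M = 1 + 1/2 + ... + 1/M and M >= 1."""
    harmonic = math.fsum(1.0 / m for m in range(1, M + 1))
    return harmonic - math.log(M) - EULER_GAMMA


if __name__ == "__main__":
    for M in (10, 100, 10_000):
        r = harmonic_remainder(M)
        # M * r should stay bounded, consistent with an O(1/M) remainder.
        print(M, r, M * r)
```

With $M$ playing the role of $y/\epsilon$, the bounded last column reflects the $O(\epsilon/y)$ term in (97).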

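On the other hand,

$$\sum_{m > y/\epsilon} \eta(y/(\epsilon m))\, \frac{1}{m} = \int_{u \ge y/\epsilon} \eta(y/(\epsilon u))\, \frac{du}{u} + O(\epsilon/y),$$

because $\frac{d}{du} \big( \eta(y/(\epsilon u))\, \frac{1}{u} \big) = O(1/u^2)$, which in turn follows since $\eta'(u)$ is compactly supported in $(0, \infty)$. As a result (97) is established with

$$c_0 = \int_{u \ge y/\epsilon} \eta(y/(\epsilon u))\, \frac{du}{u} = \int_0^1 \eta(u)\, \frac{du}{u}.$$

By symmetry we also get

$$\sum_n \hat f_\epsilon(0, n) = \int_0^\infty k_\epsilon(y)\, f_0(y)\, dy,$$

with $k_\epsilon$ given by (97).

It remains to evaluate $\hat f_\epsilon(0, 0)$, which is the excess of $\sum_m \hat f_\epsilon(m, 0) + \sum_n \hat f_\epsilon(0, n)$ over $\sum_{mn = 0} \hat f_\epsilon(m, n)$.

However,

$$\hat f_\epsilon(0, 0) = \int_{\mathbb{R}^2} f_\epsilon(x, y)\, dx\, dy = \int_{\mathbb{R}^2} f_0(xy)\, \eta_\epsilon(x)\, \eta_\epsilon(y)\, dx\, dy = \int_0^\infty k'_\epsilon(y)\, f_0(y)\, dy,$$

with $k'_\epsilon(y) = \int_0^\infty \eta(x/\epsilon)\, \eta(y/(\epsilon x))\, \frac{dx}{x}$ as a simple change of variables shows.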


19 See, for instance, Proposition 3.10 in Chapter 8 of Book I.

Now divide the integration in $x$ into four parts: where both $x/\epsilon$ and $y/(\epsilon x)$ are $\ge 1$; where one is $\ge 1$, but the other is $< 1$; and where both are $< 1$. The first part gives $\int_\epsilon^{y/\epsilon} \frac{dx}{x} = \log y - 2\log \epsilon$, since $\eta(x/\epsilon) = 1$ and $\eta(y/(\epsilon x)) = 1$ there. Next if $x/\epsilon < 1$ but $y/(\epsilon x) \ge 1$ the integral is $\int_0^\epsilon \eta(x/\epsilon)\, \frac{dx}{x} = \int_0^1 \eta(x)\, \frac{dx}{x} = c_0$. A similar evaluation holds when $y/(\epsilon x) < 1$ and $x/\epsilon \ge 1$. Finally the last range of $x$'s is empty when $\epsilon$ is sufficiently small since $x < \epsilon$ implies $y/(\epsilon x) > 1$, whenever $\epsilon^2 < y$ and $y$ is bounded away from 0. Thus

$$k'_\epsilon(y) = \log y - 2\log \epsilon + 2c_0. \tag{98}$$

Altogether then

$$\sum_{mn = 0} \hat f_\epsilon(m, n) = \int_0^\infty (2k_\epsilon - k'_\epsilon)\, f_0(y)\, dy,$$

and because of (97) and (98) this converges to $\int_0^\infty (\log y + 2\gamma)\, f_0(y)\, dy$ as $\epsilon \to 0$. Theorem 8.11 is therefore proved.

We come now to the proof of the main theorem, whose conclusion is stated in (89). Here we would like to apply the sum formula (95) to $f_0 = \chi_\mu$, the characteristic function of the interval $(0, \mu)$. However this function does not have the smoothness required for the validity of (95). We are guided instead by the reasoning used in the proofs of Theorems 8.3 and 8.4, which suggests that we regularize $\chi_\mu$ in an appropriate way.

To proceed, let us note that in the sense that Theorem 8.3 and (89) in Theorem 8.5 are parallel, we have to think of $\mu$ as playing the role of $R^2$. Indeed, setting $\mu = R^2$ will lead us to the proper choices below. With this in mind we want to replace $\chi_\mu$ by a function $\chi_{\mu,\delta}$ which is defined so that effectively $\chi_{\mu,\delta}(t) = 1$ if $0 \le t \le \mu$, that is, $\chi_{\mu,\delta}(\rho^2) = 1$ if $0 \le \rho \le R = \mu^{1/2}$, and moreover so that $\chi_{\mu,\delta}(\rho^2)$ decreases smoothly to zero in $R \le \rho \le R + \delta$. Here $\delta$ is the quantity $R^{-1/3}$ that arises in the proof of Theorem 8.3.

To give the precise definition of $\chi_{\mu,\delta}$ we fix a $C^\infty$ function $\psi$ on $[0, 1]$ so that $0 \le \psi \le 1$, with $\psi = 0$ near the origin and $\psi = 1$ near 1. We define

$$\chi_{\mu,\delta}(\rho^2) = \begin{cases} \psi(\rho) & \text{for } 0 \le \rho \le 1, \\ 1 & \text{for } 1 \le \rho \le R, \\ 1 - \psi\big( \frac{\rho - R}{\delta} \big) & \text{for } R \le \rho \le R + \delta. \end{cases}$$

Now consider the sum formula (95) with $f_0(u) = \chi_{\mu,\delta}(u)$. Then the integral term on the right-hand side is $\int_0^\infty (\log \rho + 2\gamma)\, \chi_{\mu,\delta}(\rho)\, d\rho$, which is equal to

$$\int_0^\mu (\log \rho + 2\gamma)\, d\rho + O(1) + O\Big( \int_\mu^{\mu + c\mu^{1/3}} \log \rho\, d\rho \Big),$$

because $R^2 = \mu$ and $(R + \delta)^2 = (R + R^{-1/3})^2 = \mu + O(\mu^{1/3})$. Thus the integral equals

$$\mu \log \mu + (2\gamma - 1)\mu + O(\mu^{1/3} \log \mu). \tag{99}$$

We now estimate each term $\int_0^\infty \mathfrak{J}(2\pi k^{1/2} \rho)\, f_0(\rho^2)\, \rho\, d\rho$ that arises in the sum on the right-hand side of (95) with $f_0(\rho^2) = \chi_{\mu,\delta}(\rho^2)$. We make two estimates for this term, with $R = \mu^{1/2}$:

(a) $O(R^{1/2} / k^{3/4})$ and

(b) $O(R^{1/2} \delta^{-1} / k^{5/4})$.

To see this consider the main contribution to $\mathfrak{J}(\lambda)$ for large $\lambda$ as given via (i) and (i') in Proposition 8.6 and its corollary. This is the term $c_0 \lambda^{-1/2} e^{2i\lambda}$. Thus for its contribution we need to estimate

$$\sigma^{-1/2} \int_0^\infty e^{i\sigma\rho}\, \chi_{\mu,\delta}(\rho^2)\, \rho^{1/2}\, d\rho, \tag{100}$$

where we have set $\sigma = \pm 2 \cdot 2\pi k^{1/2}$.

First since $e^{i\sigma\rho} = \frac{1}{i\sigma} \frac{d}{d\rho}(e^{i\sigma\rho})$, we may integrate by parts in (100) and see that (100) is majorized by a multiple of

$$\sigma^{-3/2} \Big( \int_0^R \rho^{-1/2}\, d\rho + \delta^{-1} \int_R^{R + \delta} \rho^{1/2}\, d\rho \Big)$$

because $\chi_{\mu,\delta}(\rho^2) = 1$ for $1 \le \rho \le R$, and $\frac{d}{d\rho} \chi_{\mu,\delta}(\rho^2) = O(1/\delta)$ for $R \le \rho \le R + \delta$. This gives us the estimate $O(\sigma^{-3/2} R^{1/2}) = O(k^{-3/4} R^{1/2})$ and this is (a) above. If instead we integrate by parts twice we see that (100) is majorized by a multiple of

$$\sigma^{-5/2} \int_0^\infty \Big| \Big( \frac{d}{d\rho} \Big)^2 \big( \chi_{\mu,\delta}(\rho^2)\, \rho^{1/2} \big) \Big|\, d\rho.$$

However $\big( \frac{d}{d\rho} \big)^2 \big( \chi_{\mu,\delta}(\rho^2)\, \rho^{1/2} \big) = O(1)$ when $0 \le \rho \le 1$; it is $c\rho^{-3/2}$ when $1 \le \rho \le R$; and $O(R^{1/2} \delta^{-2})$ when $R \le \rho \le R + \delta$. So we obtain the bound of the form $\sigma^{-5/2} \big( O(1) + O(R^{1/2} \delta^{-1}) \big) = O(\sigma^{-5/2} R^{1/2} \delta^{-1})$ for (100).

Thus we have established the bounds (a) and (b) for the main contribution coming from the first term in (i) of Proposition 8.6. The other terms in the asymptotic series give obviously smaller contributions, and we need only go as far as $N = 1$ in the formula (i), because then the error term will contribute less than either (a) or (b). Thus the estimates (a) and (b) have been established for the individual terms of the series on the right-hand side of (95).

Our conclusion is then that modulo an error term that is $O(\mu^{1/3} \log \mu)$ we have

$$\sum_{m, n \ge 1} \chi_{\mu,\delta}(mn) = \mu \log \mu + (2\gamma - 1)\mu + O\Big( R^{1/2} \sum_{1 \le k \le 1/\delta^2} d(k)\, k^{-3/4} + R^{1/2} \delta^{-1} \sum_{k > 1/\delta^2} d(k)\, k^{-5/4} \Big). \tag{101}$$

Now it is a simple fact that

$$\sum_{1 \le k \le r} d(k)\, k^{a} = O(r^{a+1} \log r) \quad\text{as } r \to \infty, \text{ if } a > -1,$$

and

$$\sum_{r < k} d(k)\, k^{a} = O(r^{a+1} \log r) \quad\text{as } r \to \infty, \text{ if } a < -1.$$

(The proof of this is outlined in Exercise 22.) Taking $r = 1/\delta^2 = R^{2/3}$ and $a = -3/4$ or $a = -5/4$, the above shows that the $O$ term in (101) is majorized by a multiple of

$$\big( R^{1/2} R^{\frac{2}{3} \cdot \frac{1}{4}} + R^{1/2} R^{1/3} R^{-\frac{2}{3} \cdot \frac{1}{4}} \big) \log R = 2R^{2/3} \log R.$$

Now if we set $N_\delta(R) = \sum_{m, n \ge 1} \chi_{\mu,\delta}(mn)$, with $\mu = R^2$, then (101) states that

$$N_\delta(R) = R^2 \log R^2 + (2\gamma - 1) R^2 + O(R^{2/3} \log R). \tag{102}$$

However by the way $\chi_{\mu,\delta}$ has been defined it is clear that

$$\chi_{(R - \delta)^2, \delta} \le \chi_\mu \le \chi_{(R + \delta)^2, \delta},$$

with $\mu = R^2$. Thus

$$N_\delta(R - \delta) \le \sum_{1 \le k \le \mu} d(k) \le N_\delta(R + \delta).$$

If we look back at (102) we see this implies

$$\sum_{1 \le k \le \mu} d(k) = \mu \log \mu + (2\gamma - 1)\mu + O(\mu^{1/3} \log \mu),$$

since $\mu = R^2$ and $\delta = R^{-1/3}$. Therefore our main result is now established.

9 Exercises 

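1. Use spherical coordinates to show that in $\mathbb{R}^d$

$$\widehat{d\sigma}(\xi) = \int_{S^{d-1}} e^{-2\pi i x \cdot \xi}\, d\sigma = c_d \int_{-1}^{1} e^{-2\pi i |\xi| u} (1 - u^2)^{\frac{d-3}{2}}\, du,$$

with $c_d$ the area of the unit sphere $S^{d-2}$ in $\mathbb{R}^{d-1}$. Then deduce formula (3) from Problem 2 in Chapter 6, Book I.

2. Let the hypersurface $M$ contain a neighborhood of a hyperplane (for example $\{x_d = 0\}$). Show that in this case $\widehat{d\mu}(\xi) \ne O(|\xi|^{-\epsilon})$, as $|\xi| \to \infty$ for any $\epsilon > 0$.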

3. Principle of stationary phase when $d = 1$. Consider

$$I(\lambda) = \int_{\mathbb{R}} e^{i\lambda \Phi(x)}\, \psi(x)\, dx,$$

where $\psi$ is a $C^\infty$ function of compact support and $x = 0$ is the only critical point of $\Phi$ in the support of $\psi$, while $\Phi''(0) \ne 0$. Then for every positive integer $N$,

$$I(\lambda) = \frac{e^{i\lambda \Phi(0)}}{\lambda^{1/2}} \big( a_0 + a_1 \lambda^{-1} + \cdots + a_N \lambda^{-N} \big) + O(\lambda^{-N - 1/2}), \quad\text{as } \lambda \to \infty.$$

The $a_k$ are determined by $\Phi''(0), \dots, \Phi^{(2k+2)}(0)$, and $\psi(0), \dots, \psi^{(2k)}(0)$. In particular $a_0 = \big( \frac{2\pi}{-i\Phi''(0)} \big)^{1/2} \psi(0)$.

Prove this in two steps:

(a) Consider first the special case when $\Phi(x) = x^2$ dealt with by (8).

(b) Pass to the case of general $\Phi$ by a change of variables that brings $\Phi(x)$ to $x^2$ or $-x^2$.


4. Suppose $\Phi$ is of class $C^k$ in an interval $[a, b]$ with $k \ge 2$. Assume that $|\Phi^{(k)}(x)| \ge 1$ throughout the interval. Prove the following generalization of Proposition 2.3:

$$\Big| \int_a^b e^{i\lambda \Phi(x)}\, dx \Big| \le c_k \lambda^{-1/k}.$$

[Hint: Suppose $\Phi^{(k-1)}(x_0) = 0$, and argue by induction as in the proof of Proposition 2.3.]

5. Consider the curve $\gamma(t) = (t, t^k)$ in $\mathbb{R}^2$, with $k$ an integer $\ge 2$. Its curvature vanishes nowhere when $k = 2$ and only at the origin, of order $k - 2$, when $k > 2$. Let $d\mu$ be defined by $\int_{\mathbb{R}^2} f\, d\mu = \int_{\mathbb{R}} f(t, t^k)\, \psi(t)\, dt$, where $\psi$ is a $C^\infty$ function of compact support, and $\psi(0) \ne 0$. Then prove:

(a) $\widehat{d\mu}(\xi) = O(|\xi|^{-1/k})$.

(b) However, this decay estimate is optimal, that is, $|\widehat{d\mu}(0, \xi_2)| \ge c |\xi_2|^{-1/k}$ if $\xi_2$ is large.

[Hint: For (a) use Exercise 4. For (b) consider for example the case when $k$ is even and verify that $\int_{-\infty}^{\infty} e^{i\lambda x^k} e^{-x^k}\, dx = c_k (1 - i\lambda)^{-1/k}$.]

6. Show that the $(L^p, L^q)$ results for the averaging operator $A$ given by Corollary 4.2 are optimal by proving the following (in, say, the case of the sphere in $\mathbb{R}^3$):

(a) Suppose $f(x)$ vanishes for small $x$ and $f(x) \ge |x|^{-r}$, for $|x| \ge 1$. Then observe $A(f)(x) \ge c|x|^{-r}$ and thus we must always have $q \ge p$. This restriction corresponds to the side of the triangle joining $(0, 0)$ and $(1, 1)$.

(b) Next let $f = \chi_{B_\delta}$, where $B_\delta$ is the ball of radius $\delta$. Note that if $\delta$ is small $A(\chi_{B_\delta}) \ge c\delta^2$ for $|1 - |x|| \le \delta/2$. So $\|f\|_{L^p} \approx \delta^{3/p}$ while $\|A(f)\|_{L^q} \gtrsim \delta^2 \delta^{1/q}$. Hence the inequality $\|A(f)\|_{L^q} \le c\|f\|_{L^p}$ implies $2 + 1/q \ge 3/p$, which corresponds to the side of the triangle joining $(3/4, 1/4)$ and $(1, 1)$.

(c) For the third inequality, use duality and (b).

7. By refining the argument given in Exercise 6 (b) one can show that the smoothing of degree $(d - 1)/2$ asserted in Proposition 1.1 fails when $p \ne 2$.

In the case $p < 2$ and $d = 3$, this can be seen by taking $\delta > 0$ small and setting $f = \varphi_\delta$ where $\varphi_\delta = \varphi(x/\delta)$ and $\varphi$ is a non-negative smooth function of compact support. Here $\|\varphi_\delta\|_{L^p} \approx c\delta^{3/p}$, while $\|\nabla A(\varphi_\delta)\|_{L^p} \gtrsim \delta \delta^{1/p}$. Hence the inequality $\|\nabla A(\varphi_\delta)\|_{L^p(\mathbb{R}^3)} \le c\|\varphi_\delta\|_{L^p(\mathbb{R}^3)}$ fails for small $\delta$ when $p < 2$.

[Hint: If $c_1 > 0$ is sufficiently small, then $\delta^2 \lesssim A(\varphi_\delta)$ and $|\nabla A(\varphi_\delta)| \gtrsim \delta$, whenever $|1 - |x|| \le c_1 \delta$.]

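8. Let $M$ be a (local) hypersurface given in coordinates $(x', x_d) \in \mathbb{R}^{d-1} \times \mathbb{R}$ as $\{x_d = \varphi(x')\}$. Suppose $F$ is any continuous function of small support defined in a neighborhood of $M$ and set $f = F|_M$.

(a) Show that $\lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_{d(x, M) < \epsilon} F\, dx$ exists and equals $\int_{\mathbb{R}^{d-1}} f(x', \varphi(x')) (1 + |\nabla_{x'} \varphi|^2)^{1/2}\, dx'$. This limit defines the induced Lebesgue measure $d\sigma$ and equals $\int_M f\, d\sigma$.

(b) Suppose $\rho$ is any defining function of $M$. Show that

$$\lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_{|\rho| < \epsilon} F\, dx = \int_M f\, \frac{d\sigma}{|\nabla \rho|}.$$

(c) Suppose $h$ is a Schwartz function on $\mathbb{R}$ with $\int_{\mathbb{R}} h(u)\, du = 1$. Then

$$\lim_{\epsilon \to 0} \epsilon^{-1} \int_{\mathbb{R}^d} h(\rho/\epsilon)\, F\, dx = \int_M f\, \frac{d\sigma}{|\nabla \rho|}.$$

[Hint: For (c), assume $h$ is even and let $I(u) = \int_{|\rho| \le u} F(x)\, dx$. Then

$$\epsilon^{-1} \int_{\mathbb{R}^d} h(\rho/\epsilon)\, F\, dx = \epsilon^{-1} \int_0^\infty h(u/\epsilon)\, \frac{dI}{du}\, du = -\epsilon^{-1} \int_0^\infty (u/\epsilon)\, h'(u/\epsilon)\, \frac{I(u)}{u}\, du.$$

Now use the fact that $-\int_0^\infty u h'(u)\, du = 1/2$ and $\frac{I(u)}{u} \to 2 \int_M f\, \frac{d\sigma}{|\nabla \rho|}$, as $u \to 0$.]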


9. Observe the following Euclidean-invariance properties of the principal curvatures of a hypersurface $M$ in $\mathbb{R}^d$. For each $h \in \mathbb{R}^d$ consider the translate $M + h$ of $M$; also for each rotation $r$ of $\mathbb{R}^d$, the rotated surface $r(M)$; and for each $\delta \in \mathbb{R}$, $\delta \ne 0$, the dilated surface $\delta M$. Denote by $\{\lambda_j(x)\}$ the principal curvatures of $M$ at $x$.

(a) Show that $\{\lambda_j(x - h)\}$, $\{\lambda_j(r^{-1}(x))\}$, and $\{\delta^{-1} \lambda_j(x/\delta)\}$ are the principal curvatures of $M + h$, $r(M)$, $\delta M$ at the points $x + h$, $r(x)$ and $\delta x$, respectively.

(b) Consider the cone $\{x_d^2 = |x'|^2,\ x \ne 0\}$ with defining function $\rho = |x'|^2 - x_d^2$. Using (a), show that at $x$ there are $d - 2$ principal curvatures that equal

, and one that vanishes. 


10. Let $f_0(r) = r^{-1/2}(\log r)^{-\delta}$, $0 < \delta < 1$, when $r \ge 2$, and $f_0(r) = 0$ otherwise.

(a) Prove that $\int_0^\infty |J_k(2\pi\rho r)|\,f_0(r)\,dr = \infty$ for every $\rho > 0$.

(b) Show as a result that, if $p \ge 2d/(d + 1)$, then (31) cannot hold for any $q$
when $M$ is the sphere.


11. One can prove that the conjectured condition $q \le \left(\frac{d-1}{d+1}\right) p'$ for $(L^p, L^q)$ restriction
cannot hold in a larger range, by the following argument given in the case $d = 2$.

(a) Suppose the inequality (31) holds for some $p$ and $q$. Show that as a result
$$\int_{1 - \delta \le |\xi| \le 1} |\hat{f}(\xi)|^q\,d\xi \le c'\,\delta\,\|f\|_{L^p}^q \quad \text{for small } \delta.$$

(b) Next choose $\hat{f}(\xi_1, \xi_2) = \eta((\xi_1 - 1)/\delta)\,\eta(\xi_2/\delta^{1/2})$, where $\eta(u) = 1$ if $|u| \le 1$. That
is, $\hat{f}(\xi)$ dominates the characteristic function of a rectangle of approximate
side lengths $\delta$ and $\delta^{1/2}$ that fits inside the annulus $1 - \delta \le |\xi| \le 1$. Use this
to obtain a contradiction when $q > p'/3$ by letting $\delta \to 0$.


12. Connect the operator $e^{it\Delta}$ and the Fourier transform as follows. Let $m_t$ be
the multiplication operator $m_t : f(x) \mapsto (4\pi i t)^{-d/4}\,e^{i|x|^2/(4t)}\,f(x)$.

(a) Show that $e^{it\Delta}(f) = m_t\big((f m_t)^{\wedge}\big)$ when $t = 1/4\pi$.

(b) Generalize this identity to any $t \ne 0$ by rescaling.


13. Let $\mathrm{Ai}(u) = \lim_{N \to \infty} \frac{1}{2\pi} \int_{-N}^{N} e^{i\left(\frac{v^3}{3} + uv\right)}\,dv$.

(a) Show that this limit exists for every $u \in \mathbb{R}$.

(b) Prove that $|\mathrm{Ai}(u)| \le c(1 + |u|)^{-1/4}$.

(c) Moreover, show that $\mathrm{Ai}(u)$ is rapidly decreasing as $u \to \infty$, for $u > 0$.

[Hint: Write $\Phi(r) = \frac{r^3}{3} + ru$, and apply the estimates in Section 2. For (a) use the
fact that $\Phi'(r) \to \infty$ as $|r| \to \infty$. For (b), use the fact that $|\Phi'(r)| \ge |u|/2$ when
$|r| \le (\tfrac{1}{2}|u|)^{1/2}$, while $|\Phi''(r)| \ge 2|r|$ when $|r| \ge (\tfrac{1}{2}|u|)^{1/2}$.]

14. Suppose $F \in L^2(\mathbb{R}^d \times \mathbb{R})$ and $S(F)(x, t) = i \int_0^t e^{i(t-s)\Delta} F(\cdot, s)\,ds$. Prove that:

(a) For each $t$, $S(F)(\cdot, t) \in L^2(\mathbb{R}^d)$, and
$$\|S(F)(\cdot, t)\|_{L^2(\mathbb{R}^d)} \le |t|^{1/2}\,\|F\|_{L^2(\mathbb{R}^d \times \mathbb{R})}.$$

(b) If $S(F)(\cdot, t) = e^{it\Delta} G(\cdot, t)$, then
$$\|G(\cdot, t_1) - G(\cdot, t_2)\|_{L^2(\mathbb{R}^d)} \le |t_1 - t_2|^{1/2}\,\|F\|_{L^2(\mathbb{R}^d \times \mathbb{R})}.$$

(c) As a result, $t \mapsto S(F)(\cdot, t)$ is continuous in the $L^2(\mathbb{R}^d)$ norm.

[Hint: For (a) and (b) use the unitarity of $e^{it\Delta}$ and Schwarz’s inequality. For (c),
approximate $F$ by $C^\infty$ functions of compact support, using (a) and (b).]

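15. Suppose $u$ is a smooth solution of (54) that decays sufficiently quickly as $|x| \to \infty$.
Show that both $\int_{\mathbb{R}^d} |u|^2\,dx$ and $\int_{\mathbb{R}^d} \left(\tfrac{1}{2}|\nabla u|^2 - \tfrac{\sigma}{\lambda + 1}|u|^{\lambda + 1}\right) dx$ are independent of $t$.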

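[Hint: For the first, note that $\int_{\mathbb{R}^d} \Delta u\,v\,dx = \int_{\mathbb{R}^d} u\,\Delta v\,dx$. For the second, observe
that $\frac{d}{dt} \int_{\mathbb{R}^d} |\nabla u|^2\,dx = -\int_{\mathbb{R}^d} \left(\frac{\partial u}{\partial t}\,\Delta \bar{u} + \frac{\partial \bar{u}}{\partial t}\,\Delta u\right) dx$.]

16. The following is a converse of Propositions 6.6 and 6.8. Suppose $u(\cdot, t)$ is
in $L^2(\mathbb{R}^d)$ for each $t$, with $t \mapsto u(\cdot, t)$ continuous in the $L^2$ norm, and $u(\cdot, 0) = 0$.
Assume that $\frac{1}{i}\frac{\partial u}{\partial t} - \Delta u = F$ as distributions, with $F \in L^2(\mathbb{R}^d \times \mathbb{R})$. Then show
that $u = S(F)$.

[Hint: Use the following fact. If $H(\cdot, t)$ is in $L^2(\mathbb{R}^d)$ for each $t$, with $t \mapsto H(\cdot, t)$
continuous in the $L^2$ norm, $H(\cdot, 0) = 0$, and $\frac{\partial H}{\partial t} = 0$ in the sense of distributions,
then $H = 0$. Apply this to $H(\cdot, t) = e^{-it\Delta}(u(\cdot, t) - S(F)(\cdot, t))$.]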

17. A solution $u$ of the non-linear Schrödinger equation (54) is uniquely determined
by its initial data $f$. Moreover the solution depends continuously on this data.
These are two features of the “well-posedness” of the problem and can be stated
as follows. Assume $\lambda = \frac{d+4}{d}$ and $q = \frac{2d+4}{d}$.

(a) Suppose $u$ and $v$ are two strong solutions defined for $|t| < a$, having the
same initial data $f \in L^2(\mathbb{R}^d)$. Show that $u = v$.

(b) Given $f \in L^2(\mathbb{R}^d)$, prove that there are $\epsilon > 0$ and $a > 0$ (depending on $f$) so
that if $\|f - g\|_{L^2} < \epsilon$, and $u$ and $v$ are strong solutions of (54) with initial
data $f$ and $g$ respectively, then
$$\|u - v\|_{L^q} \le c\,\|f - g\|_{L^2(\mathbb{R}^d)}.$$
Here $L^q = L^q(\mathbb{R}^d \times \{|t| < a\})$.

[Hint: Adapt the argument in Theorem 6.9, and for (a) proceed as follows: note
that for small $\ell > 0$,
$$\|u\|_{L^q(\mathbb{R}^d \times I)} \le \delta \quad \text{and} \quad \|v\|_{L^q(\mathbb{R}^d \times I)} \le \delta$$
for all intervals $I$ of length $\le 2\ell$. Thus
$$\|u - v\|_{L^q} \le \|M(u) - M(v)\|_{L^q} \le \tfrac{1}{2}\,\|u - v\|_{L^q},$$
with $L^q = L^q(\mathbb{R}^d \times \{|t| < \ell\})$, and so $u = v$ for $0 \le t \le \ell$. Now use the $t$-translation
invariance to apply the same argument for $u(\cdot, t + \ell)$ and $v(\cdot, t + \ell)$, and so on.

For (b) note that by choosing $a$ and $\epsilon$ sufficiently small, $\|e^{it\Delta} f\|_{L^q} < \delta/4$ and
then $\|e^{it\Delta} g\|_{L^q} < \delta/2$, where $L^q = L^q(\mathbb{R}^d \times \{|t| < a\})$. Now the iteration argument
shows that the solutions $u$ and $v$ satisfy $\|u\|_{L^q}, \|v\|_{L^q} \le \delta$. Also
$\|u - v\|_{L^q} \le \|S(|u|^{\lambda-1}u - |v|^{\lambda-1}v)\|_{L^q} + c\,\|f - g\|_{L^2}$. But
$\|S(|u|^{\lambda-1}u - |v|^{\lambda-1}v)\|_{L^q} \le \tfrac{1}{2}\,\|u - v\|_{L^q}$, so this proves (b).]

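18. Consider the Radon transform $\mathcal{R}_B$ defined by
$$\mathcal{R}_B(f)(x', x_d) = \int_{\mathbb{R}^{d-1}} f(y', x_d - B(x', y'))\,dy',$$
$x = (x', x_d) \in \mathbb{R}^{d-1} \times \mathbb{R}$, where $B$ is a fixed non-degenerate bilinear form on $\mathbb{R}^{d-1} \times \mathbb{R}^{d-1}$.
We write $B(x', y') = C(x') \cdot y'$, and assume that the dimension $d$ is odd.

Verify that:

(a) $\left\|\left(\frac{\partial}{\partial x_d}\right)^{\frac{d-1}{2}} \mathcal{R}_B(f)\right\|_{L^2(\mathbb{R}^d)} = c_B\,\|f\|_{L^2}$ for every $f \in \mathcal{S}$, with $c_B = \frac{(2\pi)^{(d-1)/2}}{|\det(C)|^{1/2}}$.

(b) If $(\mathcal{R}_B)^*$ is the (formal) adjoint of $\mathcal{R}_B$, then $\mathcal{R}_B^* = \mathcal{R}_{B^*}$, with $B^*(x', y') = -B(y', x')$.
Also $\frac{\partial}{\partial x_d} \mathcal{R}_B = \mathcal{R}_B \frac{\partial}{\partial x_d}$.

(c) Deduce from (a) and (b) the inversion formula
$$(-1)^{\frac{d-1}{2}} \left(\frac{\partial}{\partial x_d}\right)^{d-1} \mathcal{R}_B^* \mathcal{R}_B(f) = c_B^2\,f.$$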


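19. We take the Radon transform $\mathcal{R}_B$ as in the previous exercise (with the dimension
$d$ odd) and consider a localized version of it, $\mathcal{R}'_B$, given by
$$\mathcal{R}'_B(f) = \eta'\,\mathcal{R}_B(\eta f),$$
where $\eta$ and $\eta'$ are a pair of $C^\infty$ functions of compact support. Show that:

(a) $\|\mathcal{R}'_B(f)\|_{L^2} \le c\,\|f\|_{L^2}$.

(b) $\left(\frac{\partial}{\partial x}\right)^{\alpha} \mathcal{R}'_B(f)$ is a finite linear combination of terms of the form
$\left(\frac{\partial}{\partial x_d}\right)^{\ell} \big(\eta'_\ell\,\mathcal{R}_B(\eta_\ell f)\big)$ with $0 \le \ell \le |\alpha|$.

(c) Deduce from the above and part (a) of the previous exercise that $f \mapsto \mathcal{R}'_B(f)$
is a bounded linear transformation from $L^2$ to $L^2_{\frac{d-1}{2}}$.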


20. The averaging operator $\mathcal{A}$ from Section 7 satisfies the $L^p$, $L^q$ conclusions stated
for the operator $A$ in Corollary 4.2. Prove this by proceeding according to the
following steps.

First, recall that $\mathcal{A} = \sum_{k=0}^{\infty} \mathcal{A}_k$, with $\mathcal{A}_k$ given by (65) in Section 7.4, with the
sum convergent in the $L^2$ norm. Now fix $r$ and consider
$$T_s = (1 - 2^{1-s})\,e^{s^2} \sum_{k=0}^{r} 2^{-ks} \mathcal{A}_k.$$
Note that $T_0 = -\sum_{k=0}^{r} \mathcal{A}_k$, and so it will suffice to make $L^p \to L^q$ estimates for $T_0$
that are independent of $r$. Now prove:

(a) $\|T_s(f)\|_{L^2(\mathbb{R}^d)} \le M\,\|f\|_{L^2(\mathbb{R}^d)}$ if $\mathrm{Re}(s) = -\frac{d-1}{2}$.

(b) $\|T_s(f)\|_{L^\infty(\mathbb{R}^d)} \le M\,\|f\|_{L^1(\mathbb{R}^d)}$ if $\mathrm{Re}(s) = 1$.

Once (a) and (b) have been established, an interpolation via Proposition 4.4 yields
$$\|T_0(f)\|_{L^q} \le M\,\|f\|_{L^p},$$
with $p = \frac{d+1}{d}$ and $q = d + 1$, and this leads to the desired conclusion.

[Hint: Part (a) follows from the estimates (70) and (71) for $\alpha = 0$, and the
almost-orthogonality argument, Proposition 7.4. To prove (b) note that it suffices to
prove that $(1 - 2^{1-s})\,e^{s^2} \sum_{k=0}^{r} 2^{-ks}\,\eta(2^{-k}u)$ has a bounded Fourier transform if
$\mathrm{Re}(s) = 1$. Let $v$ be the dual variable to $u$. We assume first $|v| \le 1$. Let $k_0$ be the
integer for which $2^{k_0} \le 1/|v| < 2^{k_0 + 1}$. Now
$$\sum_{k=0}^{r} 2^{-ks} \int \eta(2^{-k}u)\,e^{2\pi i uv}\,du = \sum_{k \le k_0} + \sum_{k > k_0}.$$
In the first sum, write $e^{2\pi i uv} = 1 + O(|u||v|)$, and recall that $\eta(u)$ is supported in
$1/2 \le |u| \le 2$, thus
$$\sum_{k \le k_0} = c \sum_{k \le k_0} 2^{-ks}\,2^{k} + O\Big(|v| \sum_{k \le k_0} 2^{-k} \int |\eta(2^{-k}u)|\,|u|\,du\Big),$$
where $c = \int \eta$. However $\sum_{k \le k_0} 2^{-ks}\,2^{k}$ is $O(1/|1 - 2^{1-s}|)$ if $\mathrm{Re}(s) = 1$, while the
second term above is (when $\mathrm{Re}(s) = 1$)
$$O(|v|) \sum_{k \le k_0} 2^{-k} \int |\eta(2^{-k}u)|\,|u|\,du = O(|v|) \sum_{k \le k_0} 2^{k} = O(1).$$
Finally for the second sum, $\sum_{k > k_0}$, integrate by parts, writing $e^{2\pi i uv}$ as
$\frac{1}{2\pi i v}\frac{d}{du}\big(e^{2\pi i uv}\big)$, to obtain a sum that is $O\big(|v|^{-1} \sum_{k > k_0} 2^{-k}\big) = O(2^{-k_0}/|v|) = O(1)$.
If $|v| > 1$, take $k_0 = 0$, and argue similarly.]
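The exponents produced by the interpolation step can be checked by hand. The computation below is only a sketch: it assumes the usual convention for analytic interpolation, in which the point $s = 0$ sits at a parameter $\theta \in (0, 1)$ between the lines $\mathrm{Re}(s) = -\frac{d-1}{2}$ (the $L^2 \to L^2$ bound) and $\mathrm{Re}(s) = 1$ (the $L^1 \to L^\infty$ bound). The symbol $\theta$ is introduced here for illustration and does not appear in the exercise.

```latex
\[
(1-\theta)\Bigl(-\tfrac{d-1}{2}\Bigr) + \theta \cdot 1 = 0
\quad\Longrightarrow\quad
\theta = \frac{d-1}{d+1},
\]
\[
\frac{1}{p} = \frac{1-\theta}{2} + \frac{\theta}{1} = \frac{1+\theta}{2} = \frac{d}{d+1},
\qquad
\frac{1}{q} = \frac{1-\theta}{2} + \frac{\theta}{\infty} = \frac{1}{d+1}.
\]
```

Under these assumptions one gets $p = (d+1)/d$ and $q = d + 1$.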


21. Suppose $\Omega$ is a bounded open convex set with $0 \in \Omega$, and with $C^2$ boundary.
Then there is a constant $c > 0$ so that whenever $R \ge 1$ and $\delta \le 1$, then $x \in R\Omega$
and $|y| \le \delta$ implies $x + y \in (R + c\delta)\Omega$.

[Hint: One may reduce to the case $R = 1$ by rescaling. To see, for example, that
there is a $\mu$ so that $x + y \in (1 + \mu\delta)\Omega$ whenever $x \in \partial\Omega$ and $|y| \le \delta$ for $\delta$ sufficiently
small, proceed as follows. By a Euclidean change of variables, introduce new
coordinates so that $x$ has been moved to $(0, 0) \in \mathbb{R}^{d-1} \times \mathbb{R}$, and near that point $\Omega$
is given by $x_d > \varphi(x')$, with $\varphi(0) = 0$ and $\nabla_{x'}\varphi(0) = 0$. Then by convexity of $\Omega$,
the point corresponding to the initial origin is given by $(z', z_d)$ with $z_d \ge c_1 > 0$.
Also $x + y \in (1 + \mu\delta)\Omega$ is equivalent with
$$\frac{y_d + \mu\delta z_d}{1 + \mu\delta} > \varphi\left(\frac{y' + \mu\delta z'}{1 + \mu\delta}\right).$$
Since $|y_d| \le \delta$, the left-hand side is $\ge \frac{\delta}{1 + \mu\delta}$, as soon as $\mu \ge 2/c_1$. Fix such a $\mu$.
Now the right-hand side is dominated by
$$A \left|\frac{y' + \mu\delta z'}{1 + \mu\delta}\right|^2 \le A' \left(\frac{\delta^2 + (\mu\delta)^2}{1 + \mu\delta}\right),$$
and we need only choose $\delta \le c_2/\mu^2$, for appropriately small $c_2$.]


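22. Prove the following two estimates for $r \to \infty$:

(a) $\sum_{1 \le k \le r} d(k)\,k^{\alpha} = O(r^{\alpha+1} \log r)$ if $\alpha > -1$.

(b) $\sum_{r < k} d(k)\,k^{\alpha} = O(r^{\alpha+1} \log r)$ if $\alpha < -1$.

[Hint: Write
$$\sum_{k > r} d(k)\,k^{\alpha} = \sum\sum_{mn > r} (mn)^{\alpha} = \sum_{n} O\big(n^{\alpha} \min(1, (r/n)^{\alpha+1})\big).]$$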
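The growth rate in part (a) can be explored numerically before attempting a proof. The sketch below is only an illustration: the helper names (`divisor_counts`, `weighted_divisor_sum`, `ratio_to_bound`) are hypothetical and introduced here, and a bounded ratio at a few values of $r$ is suggestive but proves nothing.

```python
import math


def divisor_counts(limit):
    """Return a list d with d[k] = number of divisors of k, for 1 <= k <= limit."""
    d = [0] * (limit + 1)
    for m in range(1, limit + 1):
        # every multiple of m gains m as a divisor
        for multiple in range(m, limit + 1, m):
            d[multiple] += 1
    return d


def weighted_divisor_sum(r, alpha, d):
    """Compute the partial sum of d(k) * k**alpha over 1 <= k <= r."""
    return sum(d[k] * k ** alpha for k in range(1, r + 1))


def ratio_to_bound(r, alpha, d):
    """Ratio of the partial sum to r**(alpha + 1) * log(r)."""
    return weighted_divisor_sum(r, alpha, d) / (r ** (alpha + 1) * math.log(r))


if __name__ == "__main__":
    d = divisor_counts(20000)
    for r in (100, 1000, 10000, 20000):
        ratios = [round(ratio_to_bound(r, a, d), 3) for a in (-0.5, 0.0, 1.0)]
        print(r, ratios)
```

If the estimate in (a) holds, the printed ratios should stay bounded as $r$ grows, for each fixed $\alpha > -1$.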


23. Prove that $rJ_1(r) = \int_0^r \sigma J_0(\sigma)\,d\sigma$, by verifying the following:

(a) $J_1'(r) = \frac{1}{2}\big(J_0(r) - J_2(r)\big)$.

(b) $J_1(r) = \frac{r}{2}\big(J_0(r) + J_2(r)\big)$.

The above shows that $rJ_1'(r) + J_1(r) = rJ_0(r)$, so $(rJ_1(r))' = rJ_0(r)$, proving
the assertion.

[Hint: Recall that $J_m(r) = \frac{1}{2\pi} \int_0^{2\pi} e^{ir\sin\theta}\,e^{-im\theta}\,d\theta$. For (a), differentiate in $r$ under
the integral sign. For (b), write $e^{-i\theta} = i\,\frac{d}{d\theta}\big(e^{-i\theta}\big)$ and integrate by parts.]
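The identity can be sanity-checked numerically from the integral formula for $J_m$ quoted in the hint. The following is a rough sketch, not a proof: `bessel_j` and `integral_sigma_j0` are hypothetical helper names introduced here, and the quadrature sizes are arbitrary choices.

```python
import math


def bessel_j(m, r, n=256):
    """Approximate J_m(r) = (1/2pi) * int_0^{2pi} exp(i r sin(t)) exp(-i m t) dt.

    Only the real part cos(r sin(t) - m t) is summed, since the imaginary
    part integrates to zero over a full period. An equally spaced rule is
    used; it is very accurate for smooth periodic integrands.
    """
    total = 0.0
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        total += math.cos(r * math.sin(theta) - m * theta)
    return total / n


def integral_sigma_j0(r, n=400):
    """Composite Simpson approximation of int_0^r sigma * J_0(sigma) d(sigma); n must be even."""
    h = r / n
    total = 0.0
    for j in range(n + 1):
        sigma = j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += weight * sigma * bessel_j(0, sigma)
    return total * h / 3.0


if __name__ == "__main__":
    for r in (1.0, 5.0, 10.0):
        lhs = r * bessel_j(1, r)
        rhs = integral_sigma_j0(r)
        print(f"r={r}: r*J_1(r)={lhs:.8f}  integral={rhs:.8f}")
```

The two printed columns should agree to several decimal places; the same `bessel_j` helper can be used to test the relations in (a) and (b) at sample points.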


10 Problems 

The problems below are not intended as exercises for the reader but are 
meant instead as a guide to further results in the subject. Sources in 
the literature for each of the problems can be found in the “Notes and 
References” section. 

1.* Suppose $M$ is a local hypersurface in $\mathbb{R}^d$. In a neighborhood of a point $x_0 \in M$
one can choose a smooth vector field $\nu$ defined in this neighborhood restricted
to $M$, so that $\nu(x)$ is a unit normal vector of $M$ at each $x \in M$. (There are two
choices of this vector field, determined up to a sign.) The map $x \mapsto \nu(x)$ from $M$
to $S^{d-1}$ (with $S^{d-1}$ the unit sphere in $\mathbb{R}^d$) is called the Gauss map.

One can prove that the Gauss curvature of $M$ near $x_0$ is non-vanishing if and
only if the Gauss map is a diffeomorphism near $x_0$. Moreover, if $d\sigma_M$ and $d\sigma_{S^{d-1}}$
are the induced Lebesgue measures of $M$ and $S^{d-1}$, and $(d\sigma_{S^{d-1}})^*$ the pull-back
of $d\sigma_{S^{d-1}}$ to $M$ defined by
$$\int_M f\,(d\sigma_{S^{d-1}})^* = \int_{S^{d-1}} f(\nu^{-1}(x))\,d\sigma_{S^{d-1}}(x),$$
then $K\,d\sigma_M = (d\sigma_{S^{d-1}})^*$, with $K$ the absolute value of the Gauss curvature.

2.* The spherical maximal function. Define
$$A_t(f)(x) = \int_{S^{d-1}} f(x - ty)\,d\sigma(y)$$
for each $t \ne 0$, and $A^*(f)(x) = \sup_{t \ne 0} |A_t(f)(x)|$. Then
$$\|A^*(f)\|_{L^p} \le c_p\,\|f\|_{L^p}, \quad \text{if } p > d/(d - 1) \text{ and } d \ge 2.$$
As a result, if $f \in L^p$, $p > d/(d - 1)$, then $\lim_{t \to 0} A_t(f)(x) = f(x)$ a.e. Simple
examples show that this fails if $p \le d/(d - 1)$.

A hint that there may be estimates for $\sup_t |A_t(f)|$ (and in particular that the
result holds for $p = 2$ and $d \ge 3$) is the following simple observation for $d \ge 3$:
$$\Big\| \sup_{1 \le t \le 2} |A_t(f)| \Big\|_{L^2} \le c\,\|f\|_{L^2}.$$
To establish this, one notes that
$$\int_1^2 \int_{\mathbb{R}^d} \left|\frac{\partial A_s(f)(x)}{\partial s}\right|^2 dx\,ds \le c'\,\|f\|_{L^2}^2$$
by using Theorem 3.1. However $\sup_{1 \le t \le 2} |A_t(f)(x)| \le \int_1^2 \left|\frac{\partial A_s(f)(x)}{\partial s}\right| ds + |A_1(f)(x)|$,
hence the assertion follows by using Schwarz’s inequality.

Refinements of this argument prove the result for $\sup_{t > 0} |A_t(f)|$ when $p = 2$ and
$d \ge 3$, and then also for $p > d/(d - 1)$. Further ideas are needed for the case
$d = 2$.

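3.* There is a variant of Problem 2 that applies to the wave equation.

Suppose $u$ solves $\Delta_x u = \frac{\partial^2 u}{\partial t^2}$ for $(x, t) \in \mathbb{R}^d \times \mathbb{R}$, with $u(x, 0) = 0$, and $\frac{\partial u}{\partial t}(x, 0) = f(x)$.
If $f \in L^2$ we observe that $\frac{u(x, t)}{t} \to f(x)$ in the $L^2(\mathbb{R}^d)$ norm as $t \to 0$. One
can show that $\lim_{t \to 0} \frac{u(x, t)}{t}$ exists and equals $f(x)$ a.e. if $f \in L^p$, $p > 2d/(d + 1)$.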

4.* The restriction phenomenon (inequality (31)) is valid in $\mathbb{R}^2$, for the full range
$1 \le p < 4/3$.

[Hint: One may dualize the assertion as in the proof of Theorem 5.2. Consider the
operator $\mathcal{R}^*$ defined by
$$\mathcal{R}^*(F)(x) = \int_M e^{2\pi i x \cdot \xi}\,F(\xi)\,d\mu(\xi).$$
The desired result then becomes the inequality
$$\|\mathcal{R}^*(F)\|_{L^q(\mathbb{R}^2)} \le A\,\|F\|_{L^p(d\mu)}$$
when $q = 3p'$ and $1 \le p < 4$. Now the key point is that if we consider the singular
measure $d\nu = F\,d\mu$, then the convolution $\nu * \nu$ is actually an absolutely continuous
measure $f\,dx$ with density $f$, a locally integrable function on $\mathbb{R}^2$. This fact reflects
the assumed curvature of $M$. Indeed, it can be shown that $f \in L^r(\mathbb{R}^2)$, with
$\frac{3}{r} = \frac{2}{p} + 1$, whenever $F \in L^p(d\mu)$ and $1 \le p < 4$, and $\|f\|_{L^r(\mathbb{R}^2)} \le c\,\|F\|_{L^p(d\mu)}^2$.
Now if this is so, then
$$\mathcal{R}^*(F)^2 = (\hat{\nu}(-x))^2 = (\nu * \nu)^{\wedge}(-x) = \hat{f}(-x),$$
and by the Hausdorff-Young inequality,
$$\|\mathcal{R}^*(F)\|_{L^q}^2 = \|(\mathcal{R}^*(F))^2\|_{L^{r'}} = \|\hat{f}\|_{L^{r'}} \le c\,\|F\|_{L^p}^2,$$
and this proves the assertion since $2r' = 3p'$.]
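The exponent bookkeeping at the end of the hint can be written out explicitly. This is a sketch under the assumption that the relation between $r$ and $p$ is $\frac{3}{r} = \frac{2}{p} + 1$, with $p'$ and $r'$ the usual dual exponents; it only verifies the arithmetic, not the $L^r$ bound for $\nu * \nu$.

```latex
\[
\frac{1}{r'} = 1 - \frac{1}{r}
             = 1 - \frac{1}{3}\Bigl(\frac{2}{p} + 1\Bigr)
             = \frac{2}{3}\Bigl(1 - \frac{1}{p}\Bigr)
             = \frac{2}{3p'}
\quad\Longrightarrow\quad
2r' = 3p' = q .
\]
\[
\text{Hausdorff-Young needs } r \le 2:
\qquad
\frac{3}{r} \ge \frac{3}{2}
\iff \frac{2}{p} \ge \frac{1}{2}
\iff p \le 4 .
\]
```

The second line indicates why the dual range stops at $p = 4$, which corresponds to the endpoint $4/3$ in the original restriction statement.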

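5.* An analog of Theorem 6.3 for the wave equation is as follows. Let $u(x, t)$ be
the solution of the wave equation $\frac{\partial^2 u}{\partial t^2} = \Delta u$ for $(x, t) \in \mathbb{R}^d \times \mathbb{R}$, with initial data
$$u(x, 0) = 0, \qquad \frac{\partial u}{\partial t}(x, 0) = f(x).$$
Then $\|u(x, t)\|_{L^q(\mathbb{R}^d \times \mathbb{R})} \le c\,\|f\|_{L^2(\mathbb{R}^d)}$ if $q = \frac{2d+2}{d-2}$ and $d \ge 3$.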
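A scaling heuristic shows which exponent $q$ is compatible with an estimate of this type. The computation below is a sketch, not part of the problem statement: the rescaled functions $u_\lambda$, $f_\lambda$ and the parameter $\lambda > 0$ are introduced here for illustration.

```latex
\[
u_\lambda(x,t) = u(\lambda x, \lambda t)
\ \text{ solves the wave equation with }\
u_\lambda(x,0) = 0, \quad
\frac{\partial u_\lambda}{\partial t}(x,0) = f_\lambda(x) = \lambda f(\lambda x).
\]
\[
\|u_\lambda\|_{L^q(\mathbb{R}^d \times \mathbb{R})} = \lambda^{-\frac{d+1}{q}} \|u\|_{L^q(\mathbb{R}^d \times \mathbb{R})},
\qquad
\|f_\lambda\|_{L^2(\mathbb{R}^d)} = \lambda^{1 - \frac{d}{2}} \|f\|_{L^2(\mathbb{R}^d)} .
\]
\[
\text{Invariance under all } \lambda > 0 \text{ forces }
\frac{d+1}{q} = \frac{d}{2} - 1
\quad\Longleftrightarrow\quad
q = \frac{2(d+1)}{d-2}, \qquad d \ge 3 .
\]
```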

6.* The following further results are known about $E(R) = N(R) - \pi R^2$, the error
term appearing in Theorem 8.3.

(a) The Hardy series $R \sum_{k=1}^{\infty} \frac{r_2(k)}{k^{1/2}}\,J_1(2\pi k^{1/2} R)$ converges for each $R \ge 0$, and
its sum equals $E(R)$ whenever $R \ne k^{1/2}$, for any positive integer $k$.

(b) The error $E(R)$ is on the average a multiple of $R^{1/2}$ in the sense that
$$\int_0^r E(R)^2\,R\,dR = c\,r^3 + O(r^{2+\epsilon}),$$
for some $c > 0$ and every $\epsilon > 0$.

(c) However, $E(R)$ is not exactly $O(R^{1/2})$ since
$$\limsup_{R \to \infty} \frac{|E(R)|}{R^{1/2}} = \infty.$$

(d) It has been proved that $E(R) = O(R^{\alpha+\epsilon})$, for certain $\alpha$, $1/2 < \alpha < 2/3$. A
relatively recent result of this kind is for $\alpha = 131/208$.
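For experimentation, the lattice-point count and its error term can be computed directly. The sketch below assumes that $N(R)$ counts the integer points $(m, n)$ with $m^2 + n^2 \le R^2$; the function names `lattice_count` and `lattice_error` are hypothetical and introduced here.

```python
import math


def lattice_count(R):
    """N(R): number of integer points (m, n) with m*m + n*n <= R*R.

    Caveat: R*R is evaluated in floating point, so the count can be off when
    R*R is an integer that is not exactly representable (for example when R
    is the square root of a non-square integer).
    """
    bound = math.floor(R * R)  # integer points satisfy m^2 + n^2 <= floor(R^2)
    m_max = math.isqrt(bound)
    count = 0
    for m in range(-m_max, m_max + 1):
        n_max = math.isqrt(bound - m * m)  # largest n with n^2 <= bound - m^2
        count += 2 * n_max + 1
    return count


def lattice_error(R):
    """E(R) = N(R) - pi * R^2."""
    return lattice_count(R) - math.pi * R * R


if __name__ == "__main__":
    for R in (10, 100, 1000):
        E = lattice_error(R)
        print(R, lattice_count(R), E, E / math.sqrt(R))
```

Printing $E(R)/R^{1/2}$ along a sequence of radii gives a rough feel for statements (b) and (c), although no finite computation can exhibit the unbounded limsup.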

7.* The oscillatory integral $\mathfrak{J}(\lambda)$ can be identified in terms of Bessel functions of
the second and third kind. One has that
$$\mathfrak{J}(\lambda) = 4K_0(2\lambda) - 2\pi Y_0(2\lambda),$$
where $Y_m$ and $K_m$ are respectively Neumann and Macdonald functions.

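8.* Consider the error term in the divisor problem
$$\Delta(\mu) = \sum_{k=1}^{\mu} d(k) - \mu \log \mu - (2\gamma - 1)\mu - 1/4.$$
It is given for $\mu$ not an integer by the convergent series
$$-\frac{2}{\pi}\,\mu^{1/2} \sum_{k=1}^{\infty} \frac{d(k)}{k^{1/2}} \left[ K_1(4\pi k^{1/2} \mu^{1/2}) + \frac{\pi}{2}\,Y_1(4\pi k^{1/2} \mu^{1/2}) \right].$$
For $\Delta$ there are estimates, analogous to those for $E$ in Problem 6, with $\Delta(\mu) = O(\mu^{\beta+\epsilon})$
and $\beta = \alpha/2$.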
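The error term can be tabulated directly from its definition. The sketch below assumes the standard form $\Delta(\mu) = \sum_{k \le \mu} d(k) - \mu \log \mu - (2\gamma - 1)\mu - 1/4$ with $\gamma$ Euler's constant; the names `divisor_summatory` and `divisor_error` are hypothetical and introduced here.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def divisor_summatory(mu):
    """Sum of d(k) over 1 <= k <= mu, computed as the sum over m of floor(mu / m)."""
    n = math.floor(mu)
    return sum(n // m for m in range(1, n + 1))


def divisor_error(mu):
    """Delta(mu) under the assumed definition stated in the lead-in."""
    return (divisor_summatory(mu)
            - mu * math.log(mu)
            - (2.0 * EULER_GAMMA - 1.0) * mu
            - 0.25)


if __name__ == "__main__":
    for mu in (100.5, 1000.5, 10000.5):
        print(mu, divisor_error(mu), divisor_error(mu) / mu ** 0.25)
```

Non-integer arguments are used in the demo because the series representation in the problem is stated only for $\mu$ not an integer.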



Notes and References 


Chapter 1 

The first citation is taken from the article [40] by F. Riesz, while the second is 
a translation from an excerpt of Banach’s book [3]. 

General sources for topics in this chapter are Hewitt and Stromberg [23], 
Yosida [59], and Folland [18]. 

For Problem 7*, we refer, for instance, to the book [9] by Carothers, while 
results related to the Clarkson inequalities in Problem 6* can be found in Chap¬ 
ter 4 of Hewitt and Stromberg [23]. For a treatment of Orlicz spaces, see Rao 
and Ren [39]. Finally, in Wagon [57] the reader will find further information on 
the ideas described in Problems 8* and 9. 

Chapter 2 

The first citation is taken from Young’s article [60]. The second citation, trans¬ 
lated from the French, is an extract of a letter from M. Riesz to Hardy. The last 
citation is an extract from a letter from Hardy to M. Riesz. Both are cited in 
Cartwright [10]. In addition, this reference also contains the M. Riesz citation 
in the text in Section 1. 

For the theory of the conjugate function on the circle, analogous to the Hilbert
transform on the real line, see Chapter VII of Zygmund [61], and Katznelson [31].
The theory of $H^1_r$ and BMO is treated in Stein [45] where other sources in the
literature can be found.

For Problem 6* see for example Chapter III in Stein [45]. 

The proof of the result in Problem 7 can be carried out by complex methods 
using Blaschke products. For the details of this approach in the analogous sit¬ 
uation when the upper half-plane is replaced by the unit disc, see Chapter VII 
in Zygmund [61]. An alternate approach by real methods is, for example, in 
Chapter III of Stein and Weiss [47]. 

Problem 9* is a result of Jones and Journé, which can be found in [28], while
the reader can consult Coifman et al. [38] for results related to Problem 10*.

Chapter 3 

The first citation is taken from Bochner [7], while the second comes from the 
preface of Zygmund [61]. 

The foundations of distribution theory can be found in the work of Schwartz [41].
A further in-depth source for distribution theory is Gelfand and Shilov [20],
which is the first volume of a series of books on the topic.

Formulations of Theorem 3.2 that are more general, because they require less 
regularity of the kernels of the operators, may be found in Stein [44], Chapter 2, 
and Stein [45], Chapter 1. 

For Problems 5* and 6* see Bernstein and Gelfand [4], and Atiyah [1]. In fact,
Hörmander [26] is also relevant for Problems 6* and 7*.

Finally, for Problem 8*, see for instance Folland [17], where other references
may be found, in particular the original work of M. Riesz, Méthée, and others.

Chapter 4

The citation is a translation taken from the original work of Baire [2]. 

The proof of the existence of Besicovitch sets using the Baire category theorem
was originally given in Körner [34].

The concept of a universal element defined in Exercise 14 and also discussed in
Problem 7* comes originally from ergodic theory and the study of dynamical systems.
For a good survey regarding universality, and also the related hypercyclic
operators, see Grosse-Erdmann’s article [21].


Chapter 5 

The first citation is taken from an article by Shiryaev on Kolmogorov that ap¬ 
pears in Kolmogorov in Perspective, History of Mathematics, Volume 20, Amer¬ 
ican Mathematical Society, 2000. The second citation is an excerpt of a transla¬ 
tion from [29]. 

There are many good texts for general probability theory and stochastic pro¬ 
cesses. For instance, the reader may consult Doob [13], Durrett [14] and Koralov 
and Sinai [33]. 

For more information on the Walsh-Paley functions in Exercise 16 and Problem 2*,
the reader may turn to Schipp et al. [42]. The reader will also find
some information on lacunary series relevant for Problem 2*, in Sections 6 to 8,
Chapter V in Volume 1 of Zygmund [61].


Chapter 6 

Doob’s citation is from a review of Masani’s book, Norbert Wiener. This review
appeared in the Bulletin of the American Mathematical Society, Volume 27,
Number 2, October 1992.

The following are general sources for material on Brownian motion: Billingsley
[5] and [6], Durrett [14], Karatzas and Shreve [30], Stroock [52], Koralov and
Sinai [33], and Çinlar [11].

For problems 4* and 7*, see Durrett [14] or Karatzas and Shreve [30]. 

Chapter 7 

Lewy’s citation is from [37]. 

Relevant references for the topics discussed in this chapter, as well as the general
theory of several complex variables, are Gunning and Rossi [22], Hörmander [25],
and Krantz [35].

The approximation result in Theorem 7.1 can be found, for example, in Boggess [8],
Baouendi et al. [15] or Treves [56].

For further information on the theory of Cauchy-Riemann equations and
extensions of some results discussed in this chapter, the reader may turn to
Boggess [8].

More about analysis on the upper half-space U treated in the Appendix and 
its relation to the Heisenberg group can be found in Stein [45], Chapters XII 
and XIII. 

For Problems 1 and 2, see for instance Gunning and Rossi [22] or Krantz [35].
Problem 3* is in Chapter 2 of Chen and Shaw [12], while the theory of the
$\bar{\partial}$-Neumann equation in Problem 4* can be found in Folland and Kohn [19], and
Chen and Shaw [12].
Finally, for domains of holomorphy in Problem 5* see, for instance, Chapter 2
in Hörmander [25], or Chapters 3 and 4 in Chen and Shaw [12], while for
Problem 6*, see for instance Chapter XIII in Stein [45].

Chapter 8 

The epigraph (1840) from Kelvin is taken from [54], while the epigraph of Stokes 
is taken from [48]. 

Some general references for topics covered in Sections 1 to 5 and 7 of this 
chapter are Sogge [43] and Stein [45], Chapters 8-11. We have omitted any 
discussion of the important topic of Fourier integral operators. An introduction 
to this subject is in Sogge [43], Chapter 6, where further references may be found. 

Early work on dispersion equations was done by Segal, Strichartz [51],
Ginibre and Velo, and Strauss [49]. A systematic survey and exposition of the
subject is in Tao [53], where further references to the literature may be found.

Sources for the results on lattice points in Section 8 are Landau [36], Part 8;
Titchmarsh [55], Chapter 12; Hlawka [24]; and Iwaniec and Kowalski [27], Chapter 4.

For more about the Gauss map discussed in Problem 1* see, for example, 
Kobayashi and Nomizu [32], Sections 2 and 3. 

A treatment of the spherical maximal function can be found in Stein and 
Wainger [46] for d > 3 and Sogge [43] for d = 2. 

For Problem 4*, the restriction theorem when d = 2, see Stein [45], Section 5 
in Chapter 9. 

The result in Problem 5*, in a more general form, is in Strichartz [51]. 

For the results (a)-(c) in Problem 6* concerning $r_2(k)$, see Landau [36]. The
exponent $\alpha = 131/208$ is due to M. N. Huxley.

The identification of $\mathfrak{J}$ with Bessel-type functions in Problem 7* can be deduced
from formulas (15) and (25) in Erdélyi [16], and Sections 6.21 and 6.22
in Watson [58]. With the aid of these formulas one can connect Proposition 8.8
and Theorem 8.9 in this chapter with Theorem 1 in Strichartz [50], and formulas
in Sections 2.6-2.9 in Gelfand and Shilov [20].

The identity for $\Delta(\mu)$ in Problem 8* goes back to Voronoi and in fact predates
Hardy’s identity for $r_2(k)$.
Symbol Glossary


The page numbers on the right indicate the first time the symbol or
notation is defined or used. As usual, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ denote the integers,
the rationals, the reals, and the complex numbers respectively.

$L^p(X, \mu)$, $L^p(X)$                              $L^p$ space    2
$\|\cdot\|_{L^p(X)}$, $\|\cdot\|_{L^p}$, $\|\cdot\|_p$               $L^p$ norm    2
$L^\infty$                                           $L^\infty$ space    7
$\|\cdot\|_{L^\infty}$                                   $L^\infty$ norm or essential-supremum    8
$C(X)$                                               Continuous functions on $X$ with the sup-norm    9
$\Lambda^\alpha$                                     Hölder space of exponent $\alpha$    10
$L^p_k$                                              Sobolev space    11
$\mathcal{B}^*$                                      Dual space of $\mathcal{B}$    12
$\mathcal{B}_X$                                      Borel sets of $X$    29
$M(X)$                                               Finite signed Borel measures on $X$    29
$C_b(X)$                                             Bounded functions in $C(X)$    33
$L^{p_0} + L^{p_1}$                                  Sum of $L^{p_0}$ and $L^{p_1}$    36
$A \triangle B$                                      Symmetric difference of $A$ and $B$    36
$L^{p,r}$, $\|\cdot\|_{L^{p,r}}$                         Mixed space and mixed norm    38
$L^\Phi$                                             Orlicz space    40
                                                     Functions whose $k$th derivative are in $\Lambda^\alpha$    42
$\mathbb{R}^2_+$                                     Upper half-plane    61
$H(f)$                                               Hilbert transform of $f$    62
$\mathcal{P}_y$, $\mathcal{Q}_y$                     Poisson and conjugate Poisson kernels    63
$O(\cdots)$                                          $O$ notation    64
$C_0^\infty(\mathbb{R})$                             Space of indefinitely differentiable functions with compact support on $\mathbb{R}$    66
$\lambda_F(\alpha)$, $\lambda(\alpha)$               Distribution function of $F$    72
$H^1_r(\mathbb{R}^d)$                                Real Hardy space    75
$\|\cdot\|_{H^1_r}$                                      $H^1_r(\mathbb{R}^d)$ norm    75
$f^\dagger$                                          Truncated maximal function    76
$\|\cdot\|_{\mathrm{BMO}}$                               Bounded mean oscillation (or BMO) norm    86
$C_0^\infty(\Omega)$, $\mathcal{D}(\Omega)$          Smooth functions with compact support in $\Omega$, or test functions    100
$\partial_x^\alpha$, $|\alpha|$, $\alpha!$           Partial derivatives and related functions    100
$\mathcal{D}^*(\Omega)$                              Space of distributions on $\Omega$    100
$\delta(\cdot)$                                      Dirac delta    101
$C^k$, $C^k(\Omega)$                                 Functions of class $C^k$ on $\Omega$    101
$\mathcal{S}(\mathbb{R}^d)$, $\mathcal{S}$           Schwartz space, or test functions    105
$\|\cdot\|_N$                                            Sup-norm for derivatives up to order $N$    105
$\mathcal{S}^*$                                      Space of tempered distributions    106
$\mathrm{pv}\left(\frac{1}{x}\right)$                Principal value    111
$\Delta$                                             Laplacian operator    119
$A_d$                                                Area of unit sphere in $\mathbb{R}^d$    123
$\frac{\partial}{\partial \bar{z}_j}$, $\frac{\partial}{\partial z_j}$, $\frac{\partial}{\partial \bar{z}}$, $\frac{\partial}{\partial z}$    Derivative with respect to $\bar{z}$ and $z$    148, 277
$\Box$                                               Wave operator    155
$A^\delta$                                           $\delta$-neighborhood of $A$    177
$\mathbb{Z}_2^N$                                     $N$-fold product of $\mathbb{Z}_2$    189
$\mathbb{Z}_2^\infty$                                Infinite product of $\mathbb{Z}_2$    191
$r_n$                                                Rademacher functions    192
$m_0$, $\sigma^2$                                    Mean or expectation, and variance    196
$\nu_{\sigma^2}$                                     Gaussian distribution with mean zero and variance $\sigma^2$    196
$\mathbb{E}_{\mathcal{A}}(f)$, $\mathbb{E}(f \mid \mathcal{A})$    Conditional expectation of $f$ with respect to $\mathcal{A}$    209
$\mathcal{P}$                                        Continuous paths in $\mathbb{R}^d$ starting at the origin    240
$\tau$                                               Stopping time    254
$\mathbb{P}_r(z^0)$                                  Polydisc in $\mathbb{C}^n$    277
$C_r(z^0)$                                           Boundary circles of $\mathbb{P}_r(z^0)$    277
$\bar{\partial}$                                     Cauchy-Riemann operator    291
$L_j$                                                Tangential Cauchy-Riemann operator    300
$\mathcal{U}$                                        Upper half-space in $\mathbb{C}^n$    307
$H^2(\mathcal{U})$                                   Hardy space on $\mathcal{U}$    308
$X \lesssim Y$, $X \approx Y$                        $X \le cY$ and $c^{-1}Y \le X \le cY$ for some $c > 0$    331
$\mathrm{rotcurv}(\rho)$                             Rotational curvature    366
$r_2(k)$                                             Number of ways $k$ is a sum of two squares    377
$d(k)$                                               Number of divisors of $k$    385
                                                     Hyperbolic measure    385

Relevant items that also arose in Book I, Book II or Book III are listed
in this index, preceded by the numerals I, II or III, respectively.


$C^{(n)}$-normalized bump function, 135
$L^p$ norm, 2
$\delta$-neighborhood, 177
$O$ notation, 64; (III)12

affine hyperplane, 17
algebra, 209
    tail, 215
allied series, 50
almost surely, 192
almost-orthogonality, 372
amplitude, 325; (I)3; (II)323
analytic family of operators, 338
analytic identity, 279
approximation to the identity, 64; (I)49; (III)109
atomic decomposition, 74
atoms, 74
    1-atom, 138
    $p$-atoms, 81
    “faux”, 93
averaging operator, 322, 323, 366

Banach integral, 24
Banach space, 9
    equivalent, 46
Banach-Tarski paradox, 46
Baouendi-Treves approximation theorem, 300
Bernoulli trials, 205
Besicovitch set, 176; (III)360, 362, 374
bijective mapping, 171
Blumenthal’s zero-one law, 257
BMO, 86
Bochner’s theorem, 292
Bochner-Martinelli integral, 319
Borel
    $\sigma$-algebra, 29; (III)23, 267
    measure, 29, 242; (III)269
    sets, 29, 242; (III)23, 267
Borel-Cantelli lemma, 231; (III)42, 63
bounded mean oscillation, 86
Brownian motion, 227, 240
    recurrent, 274
    transient, 274

Calderón-Zygmund
    decomposition, 76
    distributions, 135
cancelation condition, 135
category
    first, 158
    second, 158
Cauchy integral
    representation, 277
    upper half-space, 312
Cauchy-Riemann
    equations, 277; (II)12
        tangential weak sense, 300
    operator, 148
    vector field, 290
        tangential, 291
Cauchy-Szegő integral, 311
central limit theorem, 195, 220
characteristic
    function, 216, 221; (III)27
    polynomial, 126; (III)221, 258
Clarkson inequalities, 45
class $C^k$; (I)44
    function, 101, 290
    hypersurface, 288
closed linear map, 174
closure of a set, 158
complete normed vector space, 5
conditional expectation, 209
cone, 267
    backward, 155
    forward, 155
    outside condition, 268
conjugate
    exponents, 3
    function, 50
    Poisson kernel, 63; (III)255
convergence in probability, 195
convex set, 17, 382; (III)35
convolution; (I)44, 139, 239; (III)74, 94, 253
    distributions, 102
    functions, 38, 60
covariance matrix, 221
critical point, 327; (II)326
curvature
    form, 333
    Gauss, 333
    principal, 333
    rotational, 366
    total, 333
cylinder set, 191; (III)316
cylindrical set, 242

defining function, 288
dense set, 158
differential form, 291
Dirac delta function, 23, 101; (III)110, 285
Dirichlet
    kernel, 90; (I)37; (III)179
    problem, 264; (I)20, 28, 64, 170; (II)212, 215, 216; (III)230
dispersion equations, 348
    non-linear, 359
distance
    Hausdorff, 177; (III)345
distribution, 99, 100
    convolution, 102
    derivative, 101
    finite order, 150
    function, 72
    fundamental solution, 125
    Gaussian, 196
    homogeneous, 115
    joint, 206
    measure, 195, 220
    normal, 196
    periodic, 153
    positive, 150
    principal value, 111
    regular, 117, 132
    support, 104
    tempered, 106
    weak sense convergence, 103
domain of holomorphy, 320
Donsker invariance principle, 250
dual
    exponents, 3
    space, 12
    transformation, 22
dyadic intervals, 199

elliptic differential operator, 132
equivalence, 41
equivalent Banach spaces, 46
ergodic, 208; (I)111; (III)294
error function, 229
essential-supremum, 8
event, 192
expectation conditional, 209
exponential type, 151; (II)112

Fourier coefficients, 48; (I)16, 34; (III)170
Fourier series; (I)34; (II)101; (III)171
    conjugate function, 50
    decay of coefficients, 173
    diverging at a point, 167
    periodic distributions, 153
    random, 202
Fourier transform; (I)134, 136, 181; (II)111
    surface-carried measure, 334
    tempered distribution, 108
fractional derivative, 375
function
    analytic in $\mathbb{C}^n$, 276
    class $C^k$, 101
    convolution, 38, 60
    Dirac delta, 101; (III)110
    expectation, 196
    gauge, 18
    holomorphic in $\mathbb{C}^n$, 276
    homogeneous, 115
    mean, 196
    measurable, 209; (III)28
    mutual independence, 193
    nowhere differentiable, 163, 253; (III)154, 383
    Rademacher, 192
    slowly increasing, 107
    support, 28, 104, 146; (III)53
    variance, 196
    Walsh-Paley, 230
    zig-zag, 165
fundamental solution, 125

gauge function, 18
Gauss
    curvature, 333
    map, 405
Gaussian; (I)135, 181; (III)88
    distribution, 196
    subspace, 228
generalized function, 99
generic set, 158
graph of a linear map, 174

Hahn-Banach Theorem, 20, 43
Hamel basis, 183
Hardy space, 73, 308; (III)174, 203, 213
harmonic measure, 254
Hartog’s phenomenon, 280
Hausdorff distance, 177; (III)345
Hausdorff-Young inequality, 49, 57, 90
heat
    kernel, 128; (I)120, 146, 209; (III)111
    operator, 127, 133
Heaviside function, 101; (III)285
Heisenberg group, 318
Hessian matrix, 329
Hilbert transform, 62; (III)220, 255
Hölder
    condition, 10; (I)43
    inequality, 3, 35, 38, 39
        converse, 14
holomorphic coordinates, 294
Huygens' principle, 156; (I)193
hyperbolic measure, 385
hyperplane, 17
    affine, 17
    proper, 16
hypersurface, 288
    class $C^k$, 288
hypo-elliptic, 133

identically distributed functions, 205
injective mapping, 171
interior of a set, 158; (III)3
invariance principle (Donsker), 250
invariant set, 207; (III)302
iterated logarithm, 237, 275

Jensen’s inequality, 40
John-Nirenberg inequalities, 95
joint distribution, 206

Khinchin’s inequality, 203

Laplacian, 119, 126; (I)20, 149, 185; (II)27; (III)230
lattice points, 377, 379
law of large numbers, 213
law of the iterated logarithm, 237, 275
Lebesgue’s thorn, 275
Levi form, 295
Lewy
    example, 313
    extension theorem, 306
linear functional, 11; (III)181
    bounded, 11
    continuous, 11
linear transformation
    bounded, 21
Liouville numbers, 185
Lipschitz
    boundary, 272
    condition, 10, 146; (I)82; (III)90, 147, 151, 330, 362

martingale sequence, 211
    complete, 211
maximal function, 70, 76, 85; (III)100, 261
    spherical, 406
maximum principle, 296; (I)92; (III)235
meager set, 158
mean, 196
measurable, 209
measure 

Borel, 29, 242 
continuous, 218 
harmonic, 254 
hyperbolic, 385 
Radon, 28, 100 
Minkowski inequality, 4 
for integrals, 37 
mixed norm, 38 
mixing, 207; (111)305 
multiplier, 134; (111)220 
mutual independence, 193 
function, 193 
sub-algebras, 211 

non-linear dispersion equation, 359 
norm, 9, 21 

of a continuous linear functional, 
12 

normal 

distribution, 196 
number, 231; (111)318 
normed vector space, 3, 9 
nowhere dense set, 158 
nowhere differentiable function, 163, 
253; (1)113, 126; (111)154, 383 

open mapping, 171 

open mapping theorem, 171; (11)92 

Orlicz space, 41, 45 

oscillation of a function, 161; (1)288 

outside cone condition, 268 

parallelogram law, 41, 45; (111)176 
parametrix, 131 
partition of unity, 28 
path, 223 

periodization operator, 153 
phase, 325; (1)3; (11)323 
Poisson 

kernel, 63; (I)37, 55, 149, 210;
(II)67, 78, 109, 113, 216;
(III)111, 171, 217
conjugate, 63; (I)149; (II)78, 113;
(III)255

Poisson summation formula, 379;
(I)154-156, 165, 174; (II)118
polydisc, 277 
principal 


curvatures, 333 
value, 111 
probability 

convergence, 195 
measure, 192, 195 
weak convergence, 219 
space, 192 
process 

stationary, 232 
stochastic, 239 
stopped, 261 
Prokhorov’s lemma, 243 
proper hyperplane, 16 
pseudo-convex, 296 
strongly, 296 

Rademacher functions, 192 
Radon 

measure, 28, 100 
transform, 363; (I)200, 203;
(III)363
random 
flight, 237 
Fourier series, 202 
variable, 190 
walk, 222 
recurrent, 223 
recurrent 

Brownian motion 
neighborhood, 274 
pointwise, 274
random walk, 223 
reflection, 63 
regular 

distribution, 117, 132 
point, 257 

restriction (L^p, L^q), 345
Riemann-Lebesgue lemma, 93;
(I)80; (III)94
Riesz 

convexity theorem, 57 
diagram, 57 

interpolation theorem, 52 
product, 235 
rotational 

curvature, 366 
matrix, 365 

Schrödinger equation, 348
Schwartz space, 105; (I)134, 180
second fundamental form, 333 
section, 243 
separable 
L^p space, 36
Banach space, 43 
measure space, 36 
set 

Borel, 29, 242 
closure, 158; (II)6
convex, 17, 382; (II)107
cylinder, 191; (III)316
cylindrical, 242 
dense, 158 
first category, 158 
generic, 158 
interior, 158; (II)6
invariant, 207 
meager, 158 
nowhere dense, 158 
second category, 158 
strongly convex, 383 
signature, 294 
signum, 14 

singular integral, 62, 134 
Sobolev 

embedding, 151; (III)257
space, 11, 151 

spherical maximal function, 406 
stationary 
process, 232 

stationary phase, 325, 398; (II)326
stochastic process, 239 
stopped process, 261 
stopping time, 254, 255 
Strichartz estimates, 351 
strong Markov property, 258 
strong solution, 360 
strongly convex set, 383 
strongly pseudo-convex, 296 
sub-algebra, 209 
support 

distribution, 104 
function, 104; (III)53
of a function, 28, 146 
surface-carried measure, 334 
smooth density, 334 
surjective mapping, 171 

tail algebra, 215 


tangential 

Cauchy-Riemann vector field, 291 
vector field, 290 

Tchebychev inequality, 73; (III)91
tempered distribution, 106 
test functions, 100, 105 
three-lines lemma, 53, 339; (II)133
Tietze extension principle, 269 
tight, 33, 243 
total curvature, 333 
type (of an operator), 56 

uniformly convex, 45 
universal element, 184 
upper 

half-plane, 61 
half-space, 307 

van der Corput inequality, 328 
variance, 196; (1)160 
vector field, 290 

Walsh-Paley functions, 230 
wave operator, 155 
weak 

boundedness, 184 
compactness of L^p, 37
convergence, 37, 221, 243;
(III)198
weak sense 

continuity, 108 
convergence, 103 
derivative, 101 
derivative in L^p, 10
tangential Cauchy-Riemann 
equations, 300 
weak* convergence, 44 
weak-type, 92 

weak-type inequality, 71; (III)101
Weierstrass approximation theorem,
299; (I)54, 63, 144, 163
Weierstrass preparation theorem, 
282, 319 

Wiener measure, 240, 241 

Young’s inequality, 39, 40, 60 
Yukawa potential, 149 

zero-one law, 199, 215 
zig-zag function, 165 



